I guess everybody who works with websites knows about Google Analytics. I am also sure that not many people know everything it can do. This post is about one of the features you might never have heard of: extracting data from your Google Analytics account so that it can be used in your own application. Once you know what is possible, imagine the power that knowledge gives you. One idea is to create a Hippo CMS plugin. Actually, Jeroen Reijn has already started a project for this at the Hippo Forge.

Within this post I create a component that makes it easy to read certain data from your account. While creating this component I’ll refer to available information online (mostly from google) and explain what I have done. After reading the post you’ll be able to create your own integration with a breeze.

Introduction

When you open your Google Analytics account you get a lot of report options. You can look at the visitors that come to your website, the content they look at, and the money earned with AdSense. Besides choosing the type of data, you can group it by, for instance, geographic location and date. In Google's own terms, you specify Dimensions and Metrics.

Metrics are the actual data like visitors, pageviews, visits. These metrics can be segmented using dimensions. Think about looking at the visitors per month in the year 2008. You select the begin and end date as well as the Dimension month. Other examples of Dimension are: country, pageTitle and searchKeyword.

There is more you can do with the results: you can sort them and you can filter them. Sorting is not too hard to imagine. Filtering is also intuitive, but very powerful. You can for instance create a filter so that only Firefox browsers are shown in your results, or only visitors from the Netherlands.
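To make these terms a bit more concrete, here is a minimal sketch of how dimensions, metrics, sorting and filtering end up as parameters of a single feed request. The data feed URL, the profile id ga:12345 and the class name FeedUrlSketch are assumptions for illustration only; the parameter names (ids, dimensions, metrics, start-date, end-date, sort, filters) reflect my understanding of the Data Export API and should be checked against the official documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the gdata client library.
class FeedUrlSketch {
    // Assumed location of the data feed; not taken from the post.
    static final String DATA_URL = "https://www.google.com/analytics/feeds/data";

    // Appends every parameter as an URL-encoded key=value pair.
    static String buildUrl(Map<String, String> params) {
        StringBuilder url = new StringBuilder(DATA_URL);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Visitors per month in 2008, only from the Netherlands, sorted by month.
    static String visitorsPerMonthIn2008() {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("ids", "ga:12345");            // placeholder profile id
        params.put("dimensions", "ga:month");
        params.put("metrics", "ga:visitors");
        params.put("start-date", "2008-01-01");
        params.put("end-date", "2008-12-31");
        params.put("sort", "ga:month");
        params.put("filters", "ga:country==Netherlands");
        return buildUrl(params);
    }
}
```

The client library used in the rest of this post hides this URL handling, so the sketch is only meant to show how the concepts map onto one request.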

If you want to learn more about generic google analytics features I suggest you start at the home page and work your way through all available information, which is a lot.

Before you start creating your own component that constantly pulls data from google I do want to mention the Quota policy. It is a fair but limited policy. To give an idea, you can do a maximum of 10,000 requests per day. To find out more about the policy, check this page.

Reference :
http://code.google.com/apis/analytics/

Getting started

Time to jump in. Well, of course you do need to have a Google Analytics account with at least one profile. You use your regular Gmail account, but you need to have access rights to a profile. To be able to create new profiles, you can register yourself. There is a catch: this process can take one or two days. Before you can actually retrieve data, it needs to be gathered first. Check on the normal website whether data is available. If that is the case, you can continue.

You can find out more on getting started by reading this page.

Setting up the project (dependencies)

OK, you have access to your account and now you want to start creating your application. There are a few things to do before you can create that nice app you thought up. Start by downloading the right software.

The last download contains all the libraries you need. Be sure to check out the very extensive documentation in the developer guide.

Create a new Java project in your favorite IDE and add the following jars to your classpath. All jars are provided in the download:

  • gdata-base-1.0.jar
  • gdata-analytics-meta-1.0.jar
  • gdata-analytics-1.0.jar
  • gdata-client-meta-1.0.jar
  • gdata-client-1.0.jar
  • google-collect-1.0-rc1.jar
  • gdata-core-1.0.jar

That is it, now we can start coding. We start with a small application that obtains your profiles based on provided username and password.

Check your account

The most important class when interacting with Google Analytics is com.google.gdata.client.analytics.AnalyticsService. We need to set the credentials, then we can obtain the profiles as well as execute data queries. To make things cleaner, I have created a small wrapper class for the mentioned AnalyticsService class. The following code block shows this class. It contains a constructor that accepts the username and password of the account to use. The method obtainProfiles returns a collection of Profile objects. A Profile object is a simple container for the title and the identifying key of a profile. You need the key to obtain the data. The other method, obtainQueryResults, accepts a DataQuery instance and returns the result of executing this query.

public class AnalyticsServiceWrapper {
    public static final String ACCOUNT_URL = "https://www.google.com/analytics/feeds/accounts/default";

    private static final Logger log = Logger.getLogger(AnalyticsServiceWrapper.class.getName());

    private AnalyticsService analyticsService;

    public AnalyticsServiceWrapper(String username, String password) {
        analyticsService = new AnalyticsService("gridshore-analytics");
        try {
            analyticsService.setUserCredentials(username, password);
        } catch (AuthenticationException e) {
            log.log(Level.SEVERE, "problem while setting username/password combination", e);
            throw new AnalyticsServiceException("Problem while registering with provided credentials", e);
        }
    }

    public List<Profile> obtainProfiles() {
        List<Profile> profiles = new ArrayList<Profile>();
        AccountFeed accountFeed = null;
        try {
            accountFeed = analyticsService.getFeed(new URL(ACCOUNT_URL), AccountFeed.class);
        } catch (IOException e) {
            log.log(Level.WARNING, "Problem while obtaining account information", e);
            throw new AnalyticsServiceException("IO problem while obtaining account information", e);
        } catch (ServiceException e) {
            log.log(Level.WARNING, "Problem while obtaining account information", e);
            throw new AnalyticsServiceException("Service problem while obtaining account information", e);
        }
        for (AccountEntry accountEntry : accountFeed.getEntries()) {
            profiles.add(new Profile(accountEntry.getTitle().getPlainText(), accountEntry.getTableId().getValue()));
        }
        return profiles;
    }

    public DataFeed obtainQueryResults(DataQuery query) {
        try {
            return analyticsService.getFeed(query, DataFeed.class);
        } catch (IOException e) {
            log.log(Level.WARNING, "IO problem while executing query", e);
            throw new AnalyticsServiceException("Problem while executing query", e);
        } catch (ServiceException e) {
            log.log(Level.WARNING, "Service problem while executing query", e);
            throw new AnalyticsServiceException("Problem while executing query", e);
        }

    }
}

Now have a look at the following code that uses this wrapper, obtains the profiles and prints them to System.out.

public class Application {
    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("Expected exactly two arguments:");
            System.out.println("arg 1 : username");
            System.out.println("arg 2 : password");
            return;
        }

        String username = args[0];
        String password = args[1];

        AnalyticsServiceWrapper serviceWrapper = new AnalyticsServiceWrapper(username, password);

        List<Profile> profiles = serviceWrapper.obtainProfiles();
        printProfileInformation(profiles);
    }

    private static void printProfileInformation(List<Profile> profiles) {
        for (Profile profile : profiles) {
            System.out.println("Title : " + profile.getTitle() + " key : " + profile.getUniqueId());
        }
    }
}

The result of calling this class with my credentials is:

Title : www.gridshore.nl key : ga:181998
Title : code.google.com key : ga:7760731
Title : www.coenradie.com key : ga:13887471
Title : code.google.com/p/osgi-samples/ key : ga:14878123
Title : www.jettro.net key : ga:17862866
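The Profile class itself is not listed in this post. Based on the description above (a simple container for the title and the identifying key) and the getTitle() and getUniqueId() calls in the example, a minimal sketch could look like the following. The field names and the immutability are my assumptions; the version in the source repository may differ.

```java
// Sketch of the Profile container; only getTitle() and getUniqueId()
// are confirmed by the example code, the rest is assumed.
final class Profile {
    private final String title;
    private final String uniqueId;

    public Profile(String title, String uniqueId) {
        this.title = title;
        this.uniqueId = uniqueId;
    }

    // Human readable name of the profile, for instance "www.gridshore.nl".
    public String getTitle() {
        return title;
    }

    // Key used in data queries, for instance "ga:181998".
    public String getUniqueId() {
        return uniqueId;
    }

    @Override
    public String toString() {
        return "Title : " + title + " key : " + uniqueId;
    }
}
```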

Now we have the profiles available that we want to use to extract data. I have created a small function to obtain the gridshore profile, but let’s focus on creating the query first.

Query builder

The client API comes with the class com.google.gdata.client.analytics.DataQuery, which helps you create a query. A query has a lot of parameters, and you have to know all Dimensions and Metrics yourself. Therefore I have created a DataQueryBuilder on top of the DataQuery object. It offers a fluent interface with which you can add dimensions, metrics, filters and sorting in an intuitive way.

To keep the fluent style, the builder is not created with a public constructor but through a static factory method. In our case we have two: one accepts an array of strings containing the ids of the profiles to use, the other accepts an array of Profile objects. The effect is the same: an initialized builder instance that can be used to set up the query object. The following code block shows an example of creating a query using this class. The method is taken from the Application class shown before:

public static DataQuery createQuery(Profile profile) {
    return DataQueryBuilder.newBuilder(profile)
            .startDate("2009-05-01")
            .endDate("2009-05-31")
            .dimension(Dimension.day)
            .metric(Metric.visits)
            .filters("ga:visits>=500")
            .create();
}

The following code block shows the implementation of the factory method:

public static DataQueryBuilder newBuilder(String... ids) {
    if (ids == null || ids.length == 0) {
        log.warning("The provided ids collection is empty which is not valid");
        throw new DataQueryBuilderException("Empty array of ids is provided");
    }
    for(String id : ids) {
        if (id == null || id.isEmpty()) {
            throw new DataQueryBuilderException("An empty Id is provided, this is not allowed");
        }
    }
    return new DataQueryBuilder(createUrl(DATA_URL),ids);
}

Now look at the metrics. The current query shows only the visits per day. To shorten the list I have added a filter that shows only days with 500 or more visits. The output now is:

Dag: 05 520
Dag: 11 523
Dag: 12 550
Dag: 13 501
Dag: 14 514
Dag: 27 537
Dag: 28 597

What if we want to know the number of unique visitors? We only have to change one line of code:
.metric(Metric.visits) becomes .metric(Metric.visits, Metric.visitors). The output then becomes (OK, I changed the print statement as well):

Dag: 05 520 466
Dag: 11 523 450
Dag: 12 550 488
Dag: 13 501 442
Dag: 14 514 455
Dag: 27 537 474
Dag: 28 597 533

Look at the implementation of the metric(Metric... newMetrics) method. This method accepts a varargs parameter of Metric values. Metric is an enum of all available metric options. The code of the method looks like this:

public DataQueryBuilder metric(Metric... newMetrics) {
    metrics.addAll(Arrays.asList(newMetrics));
    return this;
}
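To see the varargs method and the factory method working together outside the full builder, here is a stripped-down, self-contained sketch. The enum lists only the three metrics mentioned in this post, and the class name MetricListBuilder is made up for the example; the real DataQueryBuilder has more state and more methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Only a few values, taken from the metrics named in this post.
enum Metric { visits, visitors, pageviews }

// Hypothetical mini builder that shows the fluent pattern in isolation.
final class MetricListBuilder {
    private final List<Metric> metrics = new ArrayList<Metric>();

    // Private constructor: instances are created via the factory method.
    private MetricListBuilder() {
    }

    static MetricListBuilder newBuilder() {
        return new MetricListBuilder();
    }

    // Returning this is what allows the calls to be chained.
    MetricListBuilder metric(Metric... newMetrics) {
        metrics.addAll(Arrays.asList(newMetrics));
        return this;
    }

    // Hands out a copy so the builder can be reused safely.
    List<Metric> create() {
        return new ArrayList<Metric>(metrics);
    }
}
```

A call such as MetricListBuilder.newBuilder().metric(Metric.visits).metric(Metric.visitors, Metric.pageviews).create() collects all three metrics in the order in which they were added.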

There are other methods that do more or less the same. For the final part, look at the create() method of the builder. This method creates a DataQuery by reading the properties from the builder and setting them in the DataQuery object. For the Lists we have a utility method that creates the string as required by the DataQuery object. Now, look at the code:

public DataQuery create() {
    DataQuery dataQuery = new DataQuery(baseUrl);
    dataQuery.setStartDate(startDate);
    dataQuery.setEndDate(endDate);
    dataQuery.setDimensions(getDimensions());
    dataQuery.setMetrics(getMetrics());
    dataQuery.setIds(getIds());
    dataQuery.setSort(getSorting());
    dataQuery.setFilters(filters);

    return dataQuery;
}

private String getMetrics() {
    return createStringFromList(metrics);
}

private <E> String createStringFromList(List<E> items) {
    if (items.isEmpty()) {
        return null;
    }
    StringBuilder sb = new StringBuilder(50);
    for (E item : items) {
        if (!item.toString().startsWith("ga:")) {
            sb.append("ga:");
        }
        sb.append(item)
                .append(",");
    }
    return sb.deleteCharAt(sb.length() - 1).toString();
}

Imagine you have used the code .metric(Metric.visits, Metric.visitors); the result of getMetrics() would then be "ga:visits,ga:visitors".
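If you are on Java 8 or later, the same joining logic can also be written with streams. This is an alternative technique, not the code from the repository; the class name GaJoin is invented for the example, and it keeps the behaviour of returning null for an empty list.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stream-based variant of createStringFromList.
final class GaJoin {
    static <E> String join(List<E> items) {
        if (items.isEmpty()) {
            return null;
        }
        return items.stream()
                .map(String::valueOf)
                // add the "ga:" prefix only when it is not there yet
                .map(s -> s.startsWith("ga:") ? s : "ga:" + s)
                .collect(Collectors.joining(","));
    }
}
```

Collectors.joining takes care of the separators, so the deleteCharAt step for the trailing comma is no longer needed.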

Sorting and filtering

The DataQuery object also supports filtering and sorting. For sorting, the builder has some utility methods. Using these methods you add sorting to the provided Metric or Dimension. There are separate methods for ascending and descending sorting.
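The sorting methods are not listed in this post, so the following is only a sketch of how such utility methods could build the sort string. It assumes that the API marks a descending sort with a leading minus sign (for example -ga:visits) and separates multiple sort keys with commas; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that collects sort keys in the order they are added.
final class SortSketch {
    private final List<String> sortKeys = new ArrayList<String>();

    SortSketch ascending(String field) {
        sortKeys.add(withPrefix(field));
        return this;
    }

    // Assumption: a leading "-" requests descending order.
    SortSketch descending(String field) {
        sortKeys.add("-" + withPrefix(field));
        return this;
    }

    // Returns null when no sorting was requested.
    String create() {
        if (sortKeys.isEmpty()) {
            return null;
        }
        return String.join(",", sortKeys);
    }

    private static String withPrefix(String field) {
        return field.startsWith("ga:") ? field : "ga:" + field;
    }
}
```

The resulting string is the kind of value that could be passed to dataQuery.setSort(...) as used in the create() method shown earlier.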

Filtering is more complex, for now there is not a lot of support in the builder. You can add your filter string manually. Some pointers for filtering are:

  • A filter consists of a property (Metric or Dimension), an operator (>=) and a value (400)
  • You can group filters, “;” is and, “,” is or.
  • There are advanced options using regular expressions.

In the example I use the filter .filters("ga:visits>=500"). In the future I want a more advanced filter option that helps you create a good filter.
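As a first idea of what such a filter helper could look like, here is a minimal sketch that applies the pointers listed above: each condition is a property, an operator and a value, conditions are combined with ";" for and, and with "," for or. The class FilterSketch and its methods are hypothetical; the sketch does not validate operators and does not escape special characters in values.

```java
// Hypothetical filter helper; it only concatenates the parts.
final class FilterSketch {
    private final StringBuilder expression = new StringBuilder();

    private FilterSketch() {
    }

    // Starts a filter with a single condition, e.g. ga:visits >= 500.
    static FilterSketch where(String property, String operator, String value) {
        FilterSketch filter = new FilterSketch();
        filter.append(property, operator, value);
        return filter;
    }

    // ";" combines two conditions with a logical and.
    FilterSketch and(String property, String operator, String value) {
        expression.append(';');
        append(property, operator, value);
        return this;
    }

    // "," combines two conditions with a logical or.
    FilterSketch or(String property, String operator, String value) {
        expression.append(',');
        append(property, operator, value);
        return this;
    }

    String create() {
        return expression.toString();
    }

    private void append(String property, String operator, String value) {
        expression.append(property).append(operator).append(value);
    }
}
```

With this helper the filter from the example would be written as FilterSketch.where("ga:visits", ">=", "500").create(), and the result can be handed to the builder's filters(...) method.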

More stuff

There are some topics that I have not covered thoroughly enough. One of them is authentication problems. You might get an exception telling you that you need to provide a CAPTCHA result. I have not handled this yet, so I need to investigate more here.

I already spoke about filters; I want to have more advanced support there as well.

I have not really covered validation. Not all combinations of Dimensions and Metrics are valid. For now, you'll get exceptions from Google. It would be nice if the builder only allowed valid queries to be configured.

Sources

You can find the complete source of the example in my google code project.

http://code.google.com/p/gridshore/source/browse/#svn/trunk/GoogleAnalytics

Conclusion

Using the Google export is not hard; the technology is fine. The problem lies in understanding what you need, what queries to perform and how to use that data. Stay tuned for actual implementations in the Hippo CMS and, who knows, maybe the iPhone?

Using google analytics data in your application

4 thoughts on “Using google analytics data in your application

  • July 12, 2010 at 10:22 am

    hi,
    i am getting message “Caused by: com.google.gdata.util.InvalidEntryException: Illegal combination of dimensions and metrics”
    Can any one tell me why this error is coming and how i can read the traffic of a site using GA

  • May 20, 2010 at 7:58 pm

    It’s so simple (hiding the complexity) and great… it’s a breeze I was able to go first two steps… thxAlot!

  • June 8, 2009 at 8:41 am

    This is great stuff, one of our developers was trying to get around this.
    This post certainly helps.
    Looking forward for more.

    Anil
    twitter@waiblog
