Stijn Debrouwere edited this page May 18, 2015 · 10 revisions

This page explores the various facets of Google Analytics queries: metrics, dimensions, filters, segments, sorting, date ranges and report granularity. (Python API reference documentation is also available.) But before we can query, we'll need to figure out which profile to request data from.

Picking a profile

Once you've authenticated, you'll have access to one or more accounts. Each account can have multiple web properties (each web property has its own tracking code). Finally, each web property can have one or more profiles, also known as views in some places in the Google Analytics GUI.

You can browse through your accounts, web properties and profiles in the Google Analytics web interface, but you can also explore your account in a Python REPL:

>>> import googleanalytics as ga
>>> accounts = ga.authenticate()
>>> accounts
[<googleanalytics.account.Account object: debrouwere.org (12933299)>,
 ...]
>>> account = accounts['debrouwere.org']
>>> account.webproperties
[<googleanalytics.account.WebProperty object: http://debrouwere.org (UA-12933299-1)>, 
 ...]
>>> webproperty = account.webproperties['http://debrouwere.org']
>>> webproperty.profiles
[<googleanalytics.account.Profile object: debrouwere.org (26206906)>]
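
The session above looks items up by name (`accounts['debrouwere.org']`), while other examples on this page use a position (`accounts[0]`). How the library implements this is not shown here; the following is only a minimal sketch of one way such a container could behave. `NamedList` and the toy `Profile` class are hypothetical names made up for this illustration, and the second profile is invented sample data.

```python
class NamedList(list):
    """A list whose items can also be looked up by their `name` attribute.

    Hypothetical helper for illustration; not the library's own class.
    """

    def __getitem__(self, key):
        if isinstance(key, str):
            # Iterating a list does not go through __getitem__,
            # so this loop cannot recurse.
            for item in self:
                if item.name == key:
                    return item
            raise KeyError(key)
        # Integers and slices keep their normal list behaviour.
        return super().__getitem__(key)


class Profile:
    """Toy stand-in for a profile: just a name and an ID."""

    def __init__(self, name, id):
        self.name = name
        self.id = id

    def __repr__(self):
        return f"<Profile: {self.name} ({self.id})>"


profiles = NamedList([
    Profile("debrouwere.org", "26206906"),
    Profile("staging", "00000000"),  # invented second profile
])

print(profiles[0])                 # lookup by position
print(profiles["debrouwere.org"])  # lookup by name
```

Falling back to the plain list behaviour for integers and slices keeps the container usable anywhere an ordinary list is expected.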

The default profile is available under WebProperty#profile (vs. WebProperty#profiles). When working with the API, you will usually want to work with a profile that has no automatic filters applied to it – or as few as possible. Then, just add whatever filters you'd like to your query.

It is also possible to navigate to a profile during authentication:

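# navigate straight to a specific profile
profile = ga.authenticate(
    account='debrouwere.org',
    webproperty='http://debrouwere.org',
    profile='debrouwere.org',
    )
# leave out the profile to get the web property's default profile
profile = ga.authenticate(
    account='debrouwere.org',
    webproperty='http://debrouwere.org',
    )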

If you don't specify a profile, the default profile will be used. (Note that there's always a default profile, but no similar concept of a default account or webproperty exists. You will always have to specify an account and a web property.)

Exploring metrics and dimensions

Metrics and dimensions can be specified using the internal ID, the slug or the (case-insensitive) human-readable name. These all work:

type                     example
id                       ga:goalCompletionsAll
slug                     goalCompletionsAll
case-insensitive slug    goalcompletionsall
human-readable name      Goal Completions
case-insensitive name    GOAL completions

assert profile.core.metrics['pageviews'] == profile.core.metrics['ga:pageviews']

If you're not quite sure about the exact name of the metric or dimension you're interested in, take a look at the Dimensions and Metrics reference or the Query Explorer.

To see which metrics and dimensions are available to you in both the Core and the Real-Time API from Python:

print(profile.core.metrics)
print(profile.realtime.metrics)
print(profile.core.dimensions)
print(profile.realtime.dimensions)

Or just take a guess:

>>> profile.core.metrics['goal completion']
KeyError: 'Cannot find goal completion among the available type. Did you mean: <googleanalytics.columns.Core object: Metric, Goal Completions (ga:goalCompletionsAll)>, <googleanalytics.columns.Core object: Metric, Goal 1 Completions (ga:goal1Completions)>, ...'
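
To make the lookup rules concrete, here is a minimal sketch of how an ID, a slug and a human-readable name could all resolve to the same column, with suggestions when nothing matches. It is not the library's implementation: `COLUMNS`, `build_index` and `resolve` are hypothetical names, the column table is hand-written, and the suggestions come from the standard library's `difflib`, which may rank candidates differently than the library does.

```python
import difflib

# A tiny, hand-written column table; the real list comes from the API.
COLUMNS = {
    "ga:pageviews": "Pageviews",
    "ga:goalCompletionsAll": "Goal Completions",
    "ga:goal1Completions": "Goal 1 Completions",
}


def build_index(columns):
    """Map every accepted spelling (lowercased) to the canonical column ID."""
    index = {}
    for column_id, name in columns.items():
        slug = column_id.split(":", 1)[1]
        for key in (column_id, slug, name):
            index[key.lower()] = column_id
    return index


def resolve(key, index):
    """Return the canonical ID for `key`, or raise KeyError with suggestions."""
    try:
        return index[key.lower()]
    except KeyError:
        close = difflib.get_close_matches(key.lower(), list(index), n=3, cutoff=0.6)
        suggestions = sorted({index[match] for match in close})
        raise KeyError(f"Cannot find {key!r}. Did you mean: {suggestions}?") from None


INDEX = build_index(COLUMNS)
print(resolve("GOAL completions", INDEX))
print(resolve("goalcompletionsall", INDEX))
```

Lowercasing every key once, when the index is built, keeps each lookup a single dictionary access.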

Date ranges and report granularity (hourly, daily, weekly, monthly, yearly)

Date ranges can be specified with Python date objects or date strings.

from datetime import date
query.range('2015-01-01', '2015-01-31')
query.range(date(2015, 1, 1), date(2015, 1, 31))

You can specify an explicit start and stop date, or just a single date together with the number of days or months to count forward or backwards from that point.

query.range('2015-01-01', days=31)
query.range('2015-01-31', days=-31)
query.range(date(2015, 1, 1), months=1)

If you specify only a number of days or months, the date range will end yesterday.

query.range(days=-7)

If you specify only a start date, the query covers just that day: the end date is assumed to be the same as the start date.

query.range('2015-01-01')
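
The rules above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the library's code: `resolve_range` is a hypothetical helper, day counts are assumed to be inclusive (so 31 days from January 1 ends on January 31), and `months` is left out because the standard library has no month arithmetic. The `today` parameter exists only to make the example reproducible.

```python
from datetime import date, timedelta


def resolve_range(start=None, stop=None, days=0, today=None):
    """Resolve a range specification into an inclusive (start, stop) pair.

    Assumed semantics, for illustration only: `days` counts inclusively,
    and a missing start date falls back to yesterday.
    """
    today = today or date.today()
    if isinstance(start, str):
        start = date.fromisoformat(start)
    if isinstance(stop, str):
        stop = date.fromisoformat(stop)
    if start is None:
        start = today - timedelta(days=1)
    if days and stop is not None:
        raise ValueError("give either a stop date or a number of days, not both")
    if days:
        # Count the anchor day itself, so days=31 from Jan 1 ends on Jan 31.
        offset = days - 1 if days > 0 else days + 1
        stop = start + timedelta(days=offset)
    if stop is None:
        stop = start  # a lone start date means "just that day"
    start, stop = sorted((start, stop))
    return start.isoformat(), stop.isoformat()


print(resolve_range("2015-01-01", days=31))
print(resolve_range("2015-01-31", days=-31))
print(resolve_range(days=-7, today=date(2015, 2, 1)))
```

Sorting the two dates at the end means a negative day count still yields a range that runs from the earlier to the later date.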

By default, Google will return just one big total for each metric. If you'd like hourly, daily, weekly, monthly or yearly results you need to add some sort of time dimension. The easiest way to do this is through eponymous convenience methods:

query.hourly('2015-01-01', '2015-01-01')
query.yearly('2010-01-01', '2015-12-31')

These methods work just like a regular Query#range specification, but they add the appropriate time dimension so that you get back a separate result for each hour, day, week... whatever the granularity you asked for.
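
Conceptually, each granularity method boils down to adding one time dimension to the query. The sketch below shows that idea on a plain dictionary. The dimension IDs are real Google Analytics dimensions, but which one the library picks for each granularity is an assumption, and `TIME_DIMENSIONS` and `with_granularity` are hypothetical names.

```python
# Assumed mapping from granularity to a Google Analytics time dimension.
TIME_DIMENSIONS = {
    "hourly": "ga:dateHour",
    "daily": "ga:date",
    "weekly": "ga:isoYearIsoWeek",
    "monthly": "ga:yearMonth",
    "yearly": "ga:year",
}


def with_granularity(raw_query, granularity):
    """Return a copy of `raw_query` with the matching time dimension added."""
    dimension = TIME_DIMENSIONS[granularity]
    dimensions = list(raw_query.get("dimensions", []))
    if dimension not in dimensions:
        dimensions.append(dimension)
    return {**raw_query, "dimensions": dimensions}


base = {"metrics": ["ga:pageviews"]}
print(with_granularity(base, "monthly"))
```

Without a time dimension the query has nothing to group by, which is why the default result is a single total per metric.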

Note: depending on the size of your audience, Google Analytics data can lag a couple of hours behind. Today's data also never covers a full 24 hours, so avoid unfair comparisons between today and previous days. When doing programmatic roll-ups, don't schedule them for midnight; 6 AM is a better bet.

Sorting and limits

# return a top 10
query.sort('pageviews', descending=True).limit(10)
# return the next 10: start row first, then row count (similar to how LIMIT works in SQL)
query.limit(11, 10)

Note: Google Analytics uses 1-indexed rows. The first row is not row 0 but row 1.
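
Because rows are 1-indexed, paging through results is an easy place for off-by-one mistakes. A minimal sketch of the arithmetic follows; `page_to_limit` is a hypothetical helper, and it assumes that a two-argument limit takes the start row first and the row count second.

```python
def page_to_limit(page, size=10):
    """Return (start_row, row_count) for a 1-indexed page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    # Page 1 starts at row 1, page 2 at row size + 1, and so on.
    start = (page - 1) * size + 1
    return start, size


for page in (1, 2, 3):
    print(page, page_to_limit(page))
```

Under that assumption, the second page of ten results covers rows 11 through 20, not rows 10 through 19.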

Filters and segments

Filters are applied at the event level (each individual pageview) whereas segments are applied later in the querying process and help you limit the data to only a certain kind of user or visit.

# limit pageviews count to just a part of your site
query.filter(pagepathlevel1='/stories')
# don't include traffic to the about page
query.filter(pagepath__ne='/about')
# return only information for mobile users
query \
    .metrics('pageviews', 'session duration') \
    .segment('mobile traffic')
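
Under the hood, a keyword filter has to end up as a filter expression in Google's own syntax, such as `ga:pagePath!=/about`. The sketch below shows one plausible translation. It is not the library's code: `to_filter`, `to_filters` and `escape` are hypothetical helpers, the column lookup is hand-written, and only the `ne` suffix appears in the examples above, so the other suffix names are assumptions.

```python
# Suffix -> Core Reporting API filter operator. Only `ne` appears in the
# examples above; the other suffix names are hypothetical.
OPERATORS = {"eq": "==", "ne": "!=", "contains": "=@", "re": "=~"}

# Hand-written lookup from lowercase keyword to column ID.
COLUMNS = {
    "pagepath": "ga:pagePath",
    "pagepathlevel1": "ga:pagePathLevel1",
}


def escape(value):
    """Backslash-escape the characters the filter syntax reserves."""
    for char in ("\\", ",", ";"):  # backslash first, to avoid double escaping
        value = value.replace(char, "\\" + char)
    return value


def to_filter(key, value):
    """Turn ('pagepath__ne', '/about') into 'ga:pagePath!=/about'."""
    name, _, suffix = key.partition("__")
    operator = OPERATORS[suffix or "eq"]
    return f"{COLUMNS[name.lower()]}{operator}{escape(str(value))}"


def to_filters(**conditions):
    """Combine several conditions with ';', which means AND."""
    return ";".join(to_filter(key, value) for key, value in conditions.items())


print(to_filters(pagepathlevel1="/stories", pagepath__ne="/about"))
```

Escaping matters because commas and semicolons double as the OR and AND separators in a filter string.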

Precision and sampled reports

For queries that should run faster, you may specify a lower precision, and for those that need to be more precise, a higher precision:

# faster queries
query.range('2014-01-01', '2014-01-31', precision=0)
query.range('2014-01-01', '2014-01-31', precision='FASTER')
# queries with the default level of precision (usually what you want)
query.range('2014-01-01', '2014-01-31')
query.range('2014-01-01', '2014-01-31', precision=1)
query.range('2014-01-01', '2014-01-31', precision='DEFAULT')
# queries that are more precise
query.range('2014-01-01', '2014-01-31', precision=2)
query.range('2014-01-01', '2014-01-31', precision='HIGHER_PRECISION')        

Unless you are absolutely sure you need it, don't bother setting a precision. The default is usually plenty fast, and the gain from higher precision is usually marginal and almost never worth the huge increase in query time.
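
Before reaching for a higher precision, it can help to check whether a report was sampled at all. The Core Reporting API response carries the fields `containsSampledData`, `sampleSize` and `sampleSpace` for this purpose. The sketch below reads them from a raw response dictionary; `sampling_summary` is a hypothetical helper and the sample response is invented.

```python
def sampling_summary(raw_report):
    """Describe whether a raw report is based on sampled data."""
    if not raw_report.get("containsSampledData", False):
        return {"sampled": False, "share": 1.0}
    # The API returns both counts as strings.
    size = int(raw_report["sampleSize"])
    space = int(raw_report["sampleSpace"])
    return {"sampled": True, "share": size / space}


# Invented example response, trimmed to the relevant fields.
raw = {
    "containsSampledData": True,
    "sampleSize": "250000",
    "sampleSpace": "1000000",
}
print(sampling_summary(raw))
print(sampling_summary({"containsSampledData": False}))
```

A share close to 1.0 suggests that raising the precision would change little.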

Querying closer to the metal

In some cases, it can be useful to construct a query directly, without resorting to the convenience methods on the Query object. Lower-level access is provided through the query.set method, which accepts a key and a value, a dictionary of key-value pairs, or keyword arguments. Whatever you pass is added to the raw query dictionary.

query = profile.core.query() \
    .set(metrics=['ga:pageviews']) \
    .set(dimensions=['ga:yearMonth']) \
    .set('start_date', '2014-07-01') \
    .set({'end_date': '2014-07-05'})
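
For readers curious how one method can accept all three calling styles, here is a minimal sketch. `RawQuery` is a hypothetical stand-in, not the library's Query class, and it simply mutates its own dictionary and returns itself so that calls can be chained; the real object may behave differently.

```python
class RawQuery:
    """Toy stand-in showing a `set` method with three calling styles."""

    def __init__(self):
        self.raw = {}

    def set(self, key=None, value=None, **kwargs):
        if isinstance(key, dict):
            self.raw.update(key)   # set({'end_date': '2014-07-05'})
        elif key is not None:
            self.raw[key] = value  # set('start_date', '2014-07-01')
        self.raw.update(kwargs)    # set(metrics=['ga:pageviews'])
        return self                # returning self allows chaining


query = (
    RawQuery()
    .set(metrics=["ga:pageviews"])
    .set(dimensions=["ga:yearMonth"])
    .set("start_date", "2014-07-01")
    .set({"end_date": "2014-07-05"})
)
print(query.raw)
```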

You can always check what the raw query is going to be with the build method on queries. You can also access the raw query as well as raw report data in query.raw and report.raw respectively.

print(query.build())
from pprint import pprint
pprint(query.raw)
report = query.get()
pprint(report.raw)

If you only want the simplified OAuth2 functionality, you can skip the query builder altogether and use Google's own service interface, which is available on the Account object.

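accounts = ga.authenticate()
# the raw API expects comma-separated strings rather than lists
raw_query = {
    'ids': 'ga:26206906',
    'metrics': 'ga:pageviews',
    'dimensions': 'ga:yearMonth',
    'start_date': '2014-07-01',
    'end_date': '2014-07-05',
}
# Google's client takes query parameters as keyword arguments
accounts[0].service.data().ga().get(**raw_query).execute()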

You'll find more information about this interface in Google's own Analytics documentation for Python.

Using the Real Time Reporting API

The Real Time Reporting API is currently in closed beta. However, you can request access by filling out a short form and will generally be granted access to the API within 24 hours.

The Real Time API is very similar to the Core API:

import googleanalytics
accounts = googleanalytics.authenticate(identity='me')
profile = accounts[0].webproperties[0].profiles[0]
# Core API
profile.core.query('pageviews').daily('3daysAgo').values
# Real Time API
profile.realtime.query('pageviews', 'minutes ago').values

The only caveat is that not all of the metrics and dimensions you're used to from the Core API are supported. Take a look at the Real Time Reporting API reference documentation to find out more, or check out all available columns interactively through Profile#realtime.metrics and Profile#realtime.dimensions in Python.
