ads-api-report-fetcher

Google Ads API Report Fetcher (gaarf)

npm Downloads npm PyPI Downloads PyPI GitHub Workflow CI

Table of content

Overview

Google Ads API Report Fetcher (gaarf) simplifies running Google Ads API Reports by separating logic of writing GAQL-like query from executing it and saving results.
The library allows you to define GAQL queries alongside aliases and custom extractors and specify where the results of such query should be stored. You can find example queries in examples folder. Based on such a query the library constructs the correct GAQL query, automatically extract all necessary fields from schema and transform them into a structure suitable for writing data.

Getting started

Google Ads API Report Fetcher has two versions - Python and Node.js. Please explore the relevant section to install and run the tool:

Both versions have similar command line arguments and query syntax.

Writing Queries

Google Ads API Report Fetcher provides an extended syntax on writing GAQL queries.
Please refer to How to write queries section to learn the query syntax.

Running gaarf

If gaarf is installed globally you can run it with the following command.

gaarf <queries> [options]

Options

The required positional arguments are a list of files or a text that contain Ads queries (GAQL). On *nix OSes you can use a glob pattern, e.g. ./ads-queries/**/*.sql. A file path can be not only a local path but also a GCS file (gs://).

If you run the tool on a *nix OS then your shell (like zsh/bash) probably supports file names expansion (see bash, zsh, 14.8 Filename Generation). And so it does expansion of glob pattern (file mask) into a list of files.

Currently only the NodeJS version supports wildcards natively (without glob expansion by the OS).

Instead of passing files you can pass commands (currently supported only by NodeJs versions):

Options:

Options specific for CSV writer:

Options specific for BigQuery writer:

Options specific for Console writer:

Options specific for SqlAlchemy writer (Python version only):

Options specific for Sheet writer (Python version only):

Options specific for JSON writer:

Query specific options

If your query contains macros, templates, or sql you need to pass --macro., --template., or --sql. CLI flags to gaarf. Learn more about each of those in How to write queries document:

By default gaarf expect a list of files, but with --input=console option you can provide query(-ies) directly from console:

gaarf "select customer.id from customer" 'SELECT campaign.id FROM campaign WHERE campaign.advertising_channel_type="SEARCH"' \
  --input=console --output=console

(NodeJS version only) You can use all types of outputs and might want to specify queries names (will be used for output files/tables), to do it prepend a query with “some_name:”. E.g.:

gaarf "customer:select customer.id from customer" --input=console --output=bq

For NodeJS version any of arguments can be specified via environment variable which name starts with “GAARF_” (e.g. GAARF_ACCOUNT).

Postprocessing

Once reports have been fetched you might use gaarf-bq or gaarf-sql (utilities that installed alongside with gaarf) to run queries in BigQuery or any other DB supposed by SqlAlchemy based on collected data in there. Essentially it’s a simple tool for executing queries from files, optionally creating tables for query results.

gaarf-bq <files> [options]
gaarf-sql <files> [options]

If your query contains macros, templates, or sql you need to pass --macro., --template., or --sql. CLI flags to to gaarf-bq or gaarf-sql. Lear more about each of those in How to write queries document:

The tool assumes that scripts you provide are DDL, i.e. contains statements like create table or create view.

In general it’s recommended to separate tables with data from Ads API and final tables/views created by your post-processing queries.

BigQuery specific options:

So it’s likely that your final tables will be in a separate dataset (or datasets). To allow the tool to create those datasets for you, make sure that macro for your datasets contains the word “dataset”. In that case gaarf-bq will check that dataset exists and create it if not.

For example:

CREATE OR REPLACE TABLE `{dst_dataset}.my_dashboard_table` AS
SELECT * FROM {ads_ds}.{campaign}

In this case gaarf-bq will check for existence of a dataset specified as ‘dst_dataset’ macro.

SqlAlchemy specific options [Python only]:

Connection string can be:

If the connection string relies on parameters, please export them:

export GAARF_USER=test
export GAARF_PASSWORD=test
export GAARF_DB_HOST=test
export GAARF_DB_PORT=12345
export GAARF_DB_NAME=test

Dynamic dates

Macro values can contain a special syntax for dynamic dates. If a macro value starts with :YYYY it will be processed as a dynamic expression to calculate a date based on the current date. The syntax is: :PATTERN-N, where N is a number of days/months/years and PATTERN is one of the following:

Example with providing values for macro start_date and end_date (that can be used in queries as date range) as a range from 1 month ago to yesterday:

gaarf google_ads_queries/*.sql --ads-config=google-ads.yaml \
  --account=1234567890 --output=bq \
  --macro.start_date=:YYYYMM-1 \
  --macro.end_date=:YYYYMMDD-1 \

So if today is 2022-07-29 then start_date will be ‘2022-06-29’ (minus one month) and end_date will be ‘2022-07-28’ (minus one day).

NOTE: dynamic date macro (:YYYY) can be defined as expressions as well (e.g. ${today()-1} instead of ‘:YYYYMMDD-1’) so they are two alternatives. But with expressions you won’t need to provide any arguments. With expressions we’ll have easier deployment (no arguments needed) but with dynamic date macro more flexibility if you need to provide different values (sometimes dynamic, sometimes fixed).

If you need to get the current date value in your query you can use a special macro date_iso that both versions support. In runtime you can assume that its value is always provided with the current date in the YYYYMMDD format. But you can override it via arguments if needed (e.g. --macro.date_iso=:YYYYMMDD-1).

Docker

You can run Gaarf as a Docker container.

export GAARF_ACCOUNT=123456
docker run  \
  -v $HOME/google-ads.yaml:/root/google-ads.yaml \
  ghcr.io/google/gaarf-py:latest \
  gaarf "SELECT customer.id AS account_id FROM customer" \
  --input=console --output=console \
  --account=$GAARF_ACCOUNT --ads_config=/root/google-ads.yaml

Build a container image

The repository contains sample Dockerfile’s for both versions (Node/Python) that you can use to build a Docker image.

If you cloned the repo then you can just run docker build (see below) inside it (in js/py folders) with the local Dockerfile. Otherwise you can just download Dockerfile into an empty folder:

curl -L https://raw.githubusercontent.com/google/ads-api-report-fetcher/main/js/Dockerfile > Dockerfile

Sample Dockerfile’s don’t depend on sources, they install gaarf from registries for each platform (npm and PyPi). To build an image with name ‘gaarf’ (the name is up to you but you’ll use it to run a container later) run the following command in a folder with Dockerfile:

sudo docker build . -t gaarf

Now you can run a container from this image.

Run a container

For running a container you’ll need the same parameters as you would provide for running it in command line (a list of ads scripts and a Ads API config and other parameters) and authentication for Google Cloud if you need to write data to BigQuery. The latter is achievable via declaring GOOGLE_APPLICATION_CREDENTIALS environment variable with a path to a service account key file.

You can either embed all them into the image on build or supply them in runtime when you run a container.

The aforementioned Dockerfile assumes the following:

So you can map your local files onto these paths so that Gaarf inside a container will find them. Or copy them before building, so they will be embedded into the image.

This is an example of running Gaarf (Node version) with mapping local files, assuming you have .gaarfrc and service_account.json in the current folder:

sudo docker run --mount type=bind,source="$(pwd)/.gaarfrc",target=/app/.gaarfrc \
  --mount type=bind,source="$(pwd)/ads-scripts",target=/app/ads-scripts \
  --mount type=bind,source="$(pwd)/service_account.json",target=/app/service_account.json \
  gaarf ./ads-scripts/*.sql

Here we mapped local .gaarfrc with with all parameters (alternatively you can pass them explicitly in command line), mapped a local service_account.json file with SA keys for authenticating in BigQuery, mapped a local folder “ads-scripts” with all Ads scripts that we’re passing by wildcard mask (it’ll be expanded to a list of files by your shell).

Gaarf Cloud Workflow

Inside gcp folder you can find code for deploying Gaarf to Google Cloud. There are the following components provided:

Please see the README there for all information.

Differences in Python and NodeJS versions

Query syntax and features

There are differences in supported features for queries.

NodeJS-only features:

Output BigQuery structure

There are differences in how tools process Ads queries. Python version sends queries to Ads API and parses the result. From the result it creates a BigQuery schema. That’s becasue tables in BQ are created only when a query retuned some data. NodeJS on the contrary parses queries and initializes BigQuery schema before execution. So that it creates BQ tables regardless of the results.

There are differences in BigQuery table structures as well. Python version creates one table per script. While NodeJS creates a table per script per customer and then creates a view to combine all customer tables. For example, you have a query campaign.sql. As a result you’ll get a querable source ‘campaign’ in BigQuery in any way. But for Python version it’ll be a table. For NodeJS it’ll be a view like create view dataset.campaign as select * from campaign_* when _TABLE_PREFIX in (cid1,cid2), where cid1, cid2 are customer ids you supplied.

From Ads API we can get arrays, structs and arrays of arrays or structs. In Python version all arrays will be degraded to string with “|” separator. In NodeJS version the result by default will be a repeated field (array) but can be degrated to string with separator via the bq.array-handling option. If values of an array from Ads API are also arrays or structs, they will be converted to JSON.

API support

Python version supports any API version (currently available). While as NodeJS parses query structure it supports only one particular version (it’s printed when you run gaarf without arguments).

Disclaimer

This is not an officially supported Google product.