
BigQuery

Important

To save data to BigQuery, install garf-io with BigQuery support:

pip install garf-io[bq]

The bq writer saves a GarfReport to a BigQuery table.

garf query.sql --source API_SOURCE \
  --output bq

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter()
writer.write(sample_report, 'query')

Parameters

project

By default, reports are saved to the project set in GOOGLE_CLOUD_PROJECT. You can override it with the project parameter.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.project=PROJECT_ID

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(project="PROJECT_ID")
writer.write(sample_report, 'query')
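
The fallback described above can be pictured with a minimal sketch. The helper `resolve_project` is hypothetical and is not part of garf-io; it assumes GOOGLE_CLOUD_PROJECT is read as an environment variable and that an explicit value always wins.

```python
import os


def resolve_project(project=None):
    """Return the explicit project if given, else GOOGLE_CLOUD_PROJECT.

    Hypothetical helper for illustration only; garf-io may resolve
    the project differently.
    """
    resolved = project or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not resolved:
        raise ValueError("Set GOOGLE_CLOUD_PROJECT or pass project explicitly")
    return resolved
```

With GOOGLE_CLOUD_PROJECT set, calling `resolve_project()` returns its value, while `resolve_project("PROJECT_ID")` ignores the environment.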

dataset

By default, reports are saved to the garf dataset. You can override it with the dataset parameter.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.dataset=DATASET

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(dataset="DATASET")
writer.write(sample_report, 'query')
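
BigQuery addresses a table as project.dataset.table. The sketch below shows how the project and dataset parameters might combine with the destination passed to `write`. The helper `table_id` is hypothetical, and treating the destination `'query'` as the table name is an assumption rather than documented garf-io behavior.

```python
def table_id(project, dataset="garf", table="query"):
    """Build a fully qualified BigQuery table ID.

    Hypothetical helper; the 'garf' default mirrors the default dataset
    named above, and 'query' is assumed to become the table name.
    """
    return f"{project}.{dataset}.{table}"


print(table_id("PROJECT_ID"))
print(table_id("PROJECT_ID", dataset="DATASET"))
```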

location

By default, reports are saved to the US location. You can override it with the location parameter.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.location=LOCATION

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(location="LOCATION")
writer.write(sample_report, 'query')

write_disposition

By default, reports overwrite any existing data in the table. You can change this behavior with the write_disposition parameter, for example WRITE_APPEND to add rows to the existing data.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.write_disposition=WRITE_APPEND

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(write_disposition="WRITE_APPEND")
writer.write(sample_report, 'query')
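
To make the difference between dispositions concrete, here is a small in-memory sketch. BigQuery load jobs define WRITE_TRUNCATE, WRITE_APPEND and WRITE_EMPTY; which of these garf-io accepts besides WRITE_APPEND is not stated here. The function `apply_write_disposition` is hypothetical and only models the semantics on plain lists.

```python
def apply_write_disposition(existing_rows, new_rows, disposition="WRITE_TRUNCATE"):
    """Model how a write disposition combines existing and new rows.

    Illustration only: this works on Python lists, not on BigQuery tables.
    """
    if disposition == "WRITE_TRUNCATE":
        # Existing data is replaced by the new rows.
        return list(new_rows)
    if disposition == "WRITE_APPEND":
        # New rows are added after the existing ones.
        return list(existing_rows) + list(new_rows)
    if disposition == "WRITE_EMPTY":
        # Writing is allowed only when the table holds no data.
        if existing_rows:
            raise ValueError("Table already contains data")
        return list(new_rows)
    raise ValueError(f"Unknown write disposition: {disposition}")


print(apply_write_disposition([[1]], [[2]]))
print(apply_write_disposition([[1]], [[2]], "WRITE_APPEND"))
```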

time_partitioning_column

By default, reports are written to a table without partitioning. With time_partitioning_column you can choose the column to partition the table by; partitions can be created by HOUR, DAY, MONTH or YEAR.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.time_partitioning_column=COLUMN

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(time_partitioning_column="COLUMN")
writer.write(sample_report, 'query')

time_partitioning_type

Type of time partitioning (DAY, HOUR, MONTH, YEAR).

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.time_partitioning_type=DAY

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(time_partitioning_type="DAY")
writer.write(sample_report, 'query')
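
The effect of each partitioning type can be illustrated by the partition a given timestamp falls into. The sketch below uses BigQuery's partition ID formats (YYYYMMDDHH, YYYYMMDD, YYYYMM, YYYY). The helper `partition_id` is hypothetical, and its DAY default is a choice made for this sketch, not a documented garf-io default.

```python
from datetime import datetime

# Partition ID format for each time partitioning type.
_PARTITION_FORMATS = {
    "HOUR": "%Y%m%d%H",
    "DAY": "%Y%m%d",
    "MONTH": "%Y%m",
    "YEAR": "%Y",
}


def partition_id(value, partitioning_type="DAY"):
    """Return the ID of the partition that a timestamp belongs to."""
    return value.strftime(_PARTITION_FORMATS[partitioning_type])


timestamp = datetime(2024, 3, 15, 9, 30)
for partitioning_type in _PARTITION_FORMATS:
    print(partitioning_type, partition_id(timestamp, partitioning_type))
```

Coarser types produce fewer, larger partitions: every row from March 2024 shares one MONTH partition but is spread over up to 31 DAY partitions.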

time_partitioning_expiration_ms

Expiration of time-partitioned tables, in milliseconds. For example, 2592000000 corresponds to 30 days.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.time_partitioning_expiration_ms=2592000000

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(time_partitioning_expiration_ms=2592000000)
writer.write(sample_report, 'query')
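
Because the expiration is given in milliseconds, it can help to derive the value from a number of days instead of typing it by hand. The helper `days_to_ms` below is a hypothetical convenience and not part of garf-io.

```python
from datetime import timedelta


def days_to_ms(days):
    """Convert a number of days to whole milliseconds."""
    return timedelta(days=days) // timedelta(milliseconds=1)


# 30 days gives the value used in the examples above.
print(days_to_ms(30))
```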

range_partitioning_column

Column used to partition the table into ranges. BigQuery requires an integer column for range partitioning.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.range_partitioning_column=COLUMN

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(range_partitioning_column="COLUMN")
writer.write(sample_report, 'query')

range_partitioning_range

Range definition in start:end:interval format. For example, 0:1000:10 covers values from 0 to 1000 in intervals of 10.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.range_partitioning_range=0:1000:10

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(range_partitioning_range="0:1000:10")
writer.write(sample_report, 'query')
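
The start:end:interval format can be read as three integers. The following sketch shows one way to parse and validate such a string; `PartitionRange` and `parse_range` are hypothetical names, and the validation rules are assumptions rather than garf-io's own checks.

```python
from typing import NamedTuple


class PartitionRange(NamedTuple):
    start: int
    end: int
    interval: int


def parse_range(value):
    """Parse a 'start:end:interval' string into three integers."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(f"Expected start:end:interval, got {value!r}")
    start, end, interval = (int(part) for part in parts)
    if interval <= 0 or start >= end:
        raise ValueError(f"Invalid range definition: {value!r}")
    return PartitionRange(start, end, interval)


parsed = parse_range("0:1000:10")
# Number of intervals the range is divided into.
print(parsed, (parsed.end - parsed.start) // parsed.interval)
```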

clustering_columns

Column(s) to cluster the table by. Multiple columns can be provided as a comma-separated string.

garf query.sql --source API_SOURCE \
  --output bq \
  --bq.clustering_columns=col1,col2

from garf.core import report
from garf.io.writers import bigquery_writer

# Create example report
sample_report = report.GarfReport(results=[[1]], column_names=['one'])

writer = bigquery_writer.BigQueryWriter(clustering_columns="col1,col2")
writer.write(sample_report, 'query')
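
A comma-separated value such as col1,col2 stands for an ordered list of column names. The sketch below shows one way such a string could be split; `parse_clustering_columns` is a hypothetical helper, and trimming whitespace around names is an assumption, not confirmed garf-io behavior.

```python
def parse_clustering_columns(value):
    """Split a comma-separated string into a list of column names."""
    return [name.strip() for name in value.split(",") if name.strip()]


print(parse_clustering_columns("col1,col2"))
```

The order of the columns matters in BigQuery: rows are sorted by the first clustering column, then by the second, and so on.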