DoiT Cloud Intelligence™

How to Copy Data Inside BigQuery

By Joshua FoxJun 15, 20202 min read
How to Copy Data Inside BigQuery

Copying data between your own tables in BigQuery has become much easier in recent years. But there are still many ways to do it, each with different limitations that can get confusing. I’ll describe each, with pluses and minuses.

Copying across data regions used to be complicated.

Copy through Storage

You could copy from BigQuery to Cloud Storage, then back again to BigQuery in the other region. This not only has that additional complicating step, but it can get expensive too. You also had to orchestrate it yourself, for example, with Composer for scheduling, buffering, and parallelization.

Dataset Copying

Recently, the Dataset Copying feature has made this much easier. The feature works across regions and is free, except for network costs. Scheduling and related features are built-in.

Intra-region copying with `bq` _, the Job API, or Copy Tables_

Copying inside a region has always been easier, and best of all, free. Within the region, you can run bq cpfrom the command line, code against the Job API, or “Copy Tables” in the Cloud Console lets you do this as well.

Scheduled Queries

The Scheduled Queries feature is another way to copy inside a region. Though not free, it is quite flexible, allowing you to write any SELECT statement for insertion to a target table. And as the name suggests, it has scheduling built-in.

Of the approaches I mentioned, only bq cpand Scheduled Queries support one common use case, daily incremental backup of an ingestion-time partitioned table. Only these can copy just the last day’s partition and preserve this as a partition in the target table.

Here are your choices in table form: