BigQuery: add sample for writing query results to a destination table. #1101

Merged (1 commit, Sep 19, 2017)
bigquery/cloud-client/query.py (33 additions, 0 deletions)

@@ -63,6 +63,30 @@ def query_standard_sql(query):
        print(row)


def query_destination_table(query, dest_dataset_id, dest_table_id):
    client = bigquery.Client()
    query_job = client.run_async_query(str(uuid.uuid4()), query)

    # Allow for query results larger than the maximum response size.
    query_job.allow_large_results = True

    # When large results are allowed, a destination table must be set.
    dest_dataset = client.dataset(dest_dataset_id)
    dest_table = dest_dataset.table(dest_table_id)
    query_job.destination = dest_table

    # Allow the results table to be overwritten.
    query_job.write_disposition = 'WRITE_TRUNCATE'

    query_job.begin()
Review comment (Contributor): doesn't .result() call .begin()?

Reply (author): Oh, I didn't realize that. I can update the sample if so.

    query_job.result()  # Wait for job to complete.

    # Verify that the results were written to the destination table.
    dest_table.reload()  # Get the table metadata, such as the schema.
    for row in dest_table.fetch_data():
        print(row)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
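
On the reviewer's question above about .result() and .begin(): a minimal sketch of what the function could collapse to, under the assumption (not confirmed in this thread) that QueryJob.result() starts a job that has not been begun in the 0.27-era google-cloud-bigquery client:

import uuid

from google.cloud import bigquery


def query_destination_table_no_begin(query, dest_dataset_id, dest_table_id):
    """Hypothetical variant that relies on result() to start the job."""
    client = bigquery.Client()
    query_job = client.run_async_query(str(uuid.uuid4()), query)
    query_job.allow_large_results = True
    dest_table = client.dataset(dest_dataset_id).table(dest_table_id)
    query_job.destination = dest_table
    query_job.write_disposition = 'WRITE_TRUNCATE'

    # Assumption from the review thread: result() begins the job if it has
    # not been started yet, so the explicit begin() call would be redundant.
    query_job.result()  # Wait for the job to complete.

    dest_table.reload()  # Get the table metadata, such as the schema.
    for row in dest_table.fetch_data():
        print(row)
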
@@ -72,10 +96,19 @@ def query_standard_sql(query):
        '--use_standard_sql',
        action='store_true',
        help='Use standard SQL syntax.')
    parser.add_argument(
        '--destination_table',
        type=str,
        help=(
            'Destination table to use for results. '
            'Example: my_dataset.my_table'))

    args = parser.parse_args()

    if args.use_standard_sql:
        query_standard_sql(args.query)
    elif args.destination_table:
        dataset, table = args.destination_table.split('.')
Review comment (Contributor): We could use a simple regular expression to validate that the destination table name is in the correct format, i.e. dataset_name.table_name.

Review comment (Contributor): However, this being a sample, it might be a better idea to keep unnecessary complexity down. So I'll leave that to you.

Reply (author): I'd rather keep the complexity low.

        query_destination_table(args.query, dataset, table)
    else:
        query(args.query)
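
For reference, a minimal sketch of the validation the reviewer floated and the author chose to leave out; the \w+ pattern is an assumption about what counts as a well-formed dataset and table name, not a rule taken from the BigQuery docs:

import re


def parse_destination_table(destination_table):
    """Split 'dataset_name.table_name', rejecting malformed input."""
    match = re.match(r'^(\w+)\.(\w+)$', destination_table)
    if not match:
        raise ValueError(
            'Expected a destination table like dataset_name.table_name, '
            'got: {}'.format(destination_table))
    return match.group(1), match.group(2)

With a helper like this, the CLI branch would read dataset, table = parse_destination_table(args.destination_table); the invocation stays the same, e.g. python query.py 'SELECT ...' --destination_table my_dataset.my_table (assuming the sample's existing positional query argument).
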
bigquery/cloud-client/query_test.py (20 additions, 0 deletions)

@@ -15,6 +15,10 @@
import query


DATASET_ID = 'test_dataset'
TABLE_ID = 'test_destination_table'


def test_query(capsys):
    # Query only outputs the first 10 rows, sort results to avoid randomness
    query_string = '''#standardSQL
@@ -44,3 +48,19 @@ def test_query_standard_sql(capsys):
    out, _ = capsys.readouterr()

    assert 'antonyandcleopatra' in out


def test_query_destination_table(capsys):
    # Query only outputs the first 10 rows, sort results to avoid randomness
    query_string = '''#standardSQL
        SELECT corpus
        FROM `publicdata.samples.shakespeare`
        GROUP BY corpus
        ORDER BY corpus
        LIMIT 10;'''

    query.query_destination_table(query_string, DATASET_ID, TABLE_ID)

    out, _ = capsys.readouterr()

    assert 'antonyandcleopatra' in out
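
The new test leaves test_dataset.test_destination_table behind after it runs. A minimal cleanup sketch one might add, assuming the dataset already exists and the 0.27-era Table.exists()/Table.delete() methods; the teardown_module hook is a pytest convention, not something this PR includes:

from google.cloud import bigquery


def teardown_module(module):
    # Hypothetical cleanup: drop the table the destination-table test wrote.
    client = bigquery.Client()
    dest_table = client.dataset(DATASET_ID).table(TABLE_ID)
    if dest_table.exists():
        dest_table.delete()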