-
Notifications
You must be signed in to change notification settings - Fork 6.6k
BigQuery : add sample for writing query results to a destination table. #1101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -63,6 +63,30 @@ def query_standard_sql(query): | |
print(row) | ||
|
||
|
||
def query_destination_table(query, dest_dataset_id, dest_table_id): | ||
client = bigquery.Client() | ||
query_job = client.run_async_query(str(uuid.uuid4()), query) | ||
|
||
# Allow for query results larger than the maximum response size. | ||
query_job.allow_large_results = True | ||
|
||
# When large results are allowed, a destination table must be set. | ||
dest_dataset = client.dataset(dest_dataset_id) | ||
dest_table = dest_dataset.table(dest_table_id) | ||
query_job.destination = dest_table | ||
|
||
# Allow the results table to be overwritten. | ||
query_job.write_disposition = 'WRITE_TRUNCATE' | ||
|
||
query_job.begin() | ||
query_job.result() # Wait for job to complete. | ||
|
||
# Verify that the results were written to the destination table. | ||
dest_table.reload() # Get the table metadata, such as the schema. | ||
for row in dest_table.fetch_data(): | ||
print(row) | ||
|
||
|
||
if __name__ == '__main__': | ||
parser = argparse.ArgumentParser( | ||
description=__doc__, | ||
|
@@ -72,10 +96,19 @@ def query_standard_sql(query): | |
'--use_standard_sql', | ||
action='store_true', | ||
help='Use standard SQL syntax.') | ||
parser.add_argument( | ||
'--destination_table', | ||
type=str, | ||
help=( | ||
'Destination table to use for results. ' | ||
'Example: my_dataset.my_table')) | ||
|
||
args = parser.parse_args() | ||
|
||
if args.use_standard_sql: | ||
query_standard_sql(args.query) | ||
elif args.destination_table: | ||
dataset, table = args.destination_table.split('.') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could use a simple regular expression to validate that the destination table name is in the correct format, i.e. dataset_name.table_name There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. However, this being a sample, it might be a better idea to keep unnecessary complexity down. So I'll leave that to you There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd rather keep the complexity low. |
||
query_destination_table(args.query, dataset, table) | ||
else: | ||
query(args.query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't .result() call .begin()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't realize that. I can update the sample if so.