From 01fe370478ed547a56f6685246400f7afd953097 Mon Sep 17 00:00:00 2001 From: Tim Swast Date: Wed, 24 Mar 2021 16:56:52 -0500 Subject: [PATCH 1/3] docs: build documentation with Sphinx --- CONTRIBUTING.md | 28 ----- README.rst | 127 ++++++++++++++++------- docs/README.rst | 223 ++++++++++++++++++++++++++++++++++++++++ docs/changelog.md | 1 + docs/index.rst | 20 ++++ docs/pybigquery/api.rst | 7 ++ 6 files changed, 340 insertions(+), 66 deletions(-) delete mode 100644 CONTRIBUTING.md create mode 100644 docs/README.rst create mode 120000 docs/changelog.md create mode 100644 docs/index.rst create mode 100644 docs/pybigquery/api.rst diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md deleted file mode 100644 index 6272489d..00000000 --- a/CONTRIBUTING.md +++ /dev/null @@ -1,28 +0,0 @@ -# How to Contribute - -We'd love to accept your patches and contributions to this project. There are -just a few small guidelines you need to follow. - -## Contributor License Agreement - -Contributions to this project must be accompanied by a Contributor License -Agreement. You (or your employer) retain the copyright to your contribution; -this simply gives us permission to use and redistribute your contributions as -part of the project. Head over to to see -your current agreements on file or to sign a new one. - -You generally only need to submit a CLA once, so if you've already submitted one -(even if it was for a different project), you probably don't need to do it -again. - -## Code Reviews - -All submissions, including submissions by project members, require review. We -use GitHub pull requests for this purpose. Consult -[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more -information on using pull requests. - -## Community Guidelines - -This project follows [Google's Open Source Community -Guidelines](https://opensource.google/conduct/). 
diff --git a/README.rst b/README.rst index 05f0d4fb..f3114b75 100644 --- a/README.rst +++ b/README.rst @@ -1,12 +1,88 @@ -SQLAlchemy dialect and API client for BigQuery. +SQLAlchemy Dialect for BigQuery +=============================== +|beta| |pypi| |versions| -Usage -===== +`SQLALchemy Dialects`_ + +- `Dialect Documentation`_ +- `Product Documentation`_ + +.. |beta| image:: https://img.shields.io/badge/support-beta-orange.svg + :target: https://github.com/googleapis/google-cloud-python/blob/master/README.rst#beta-support +.. |pypi| image:: https://img.shields.io/pypi/v/pybigquery.svg + :target: https://pypi.org/project/pybigquery/ +.. |versions| image:: https://img.shields.io/pypi/pyversions/pybigquery.svg + :target: https://pypi.org/project/pybigquery/ +.. _SQLAlchemy Dialects: https://docs.sqlalchemy.org/en/14/dialects/ +.. _Dialect Documentation: https://googleapis.dev/python/pybigquery/latest +.. _Product Documentation: https://cloud.google.com/bigquery/docs/ + + +Quick Start +----------- + +In order to use this library, you first need to go through the following steps: + +1. `Select or create a Cloud Platform project.`_ +2. [Optional] `Enable billing for your project.`_ +3. `Enable the BigQuery Storage API.`_ +4. `Setup Authentication.`_ + +.. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project +.. _Enable billing for your project.: https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project +.. _Enable the BigQuery Storage API.: https://console.cloud.google.com/apis/library/bigquery.googleapis.com +.. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html + +Installation +------------ + +Install this library in a `virtualenv`_ using pip. `virtualenv`_ is a tool to +create isolated Python environments. The basic problem it addresses is one of +dependencies and versions, and indirectly permissions. 
+ +With `virtualenv`_, it's possible to install this library without needing system +install permissions, and without clashing with the installed system +dependencies. + +.. _`virtualenv`: https://virtualenv.pypa.io/en/latest/ + + +Supported Python Versions +^^^^^^^^^^^^^^^^^^^^^^^^^ +Python >= 3.6 + +Unsupported Python Versions +^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Python == 2.7, Python == 3.5. + + +Mac/Linux +^^^^^^^^^ + +.. code-block:: console + pip install virtualenv + virtualenv + source /bin/activate + /bin/pip install pybigquery + + +Windows +^^^^^^^ + +.. code-block:: console + + pip install virtualenv + virtualenv + \Scripts\activate + \Scripts\pip.exe install pybigquery + +Usage +----- SQLAlchemy -__________ +^^^^^^^^^^ .. code-block:: python @@ -18,7 +94,7 @@ __________ print(select([func.count('*')], from_obj=table).scalar()) API Client -__________ +^^^^^^^^^^ .. code-block:: python @@ -27,12 +103,12 @@ __________ print(api_client.dry_run_query(query=sqlstr).total_bytes_processed) Project -_______ +^^^^^^^ ``project`` in ``bigquery://project`` is used to instantiate BigQuery client with the specific project ID. To infer project from the environment, use ``bigquery://`` – without ``project`` Authentication -______________ +^^^^^^^^^^^^^^ Follow the `Google Cloud library guide `_ for authentication. Alternatively, you can provide the path to a service account JSON file in ``create_engine()``: @@ -42,7 +118,7 @@ Follow the `Google Cloud library guide `_. For situations like these, or for situations where you want the ``Client`` to have a `default_query_job_config `_, you can pass many arguments in the query of the connection string. @@ -132,7 +208,7 @@ Here are examples of all the supported arguments. Any not present are either for Creating tables -_______________ +^^^^^^^^^^^^^^^ To add metadata to a table: @@ -145,28 +221,3 @@ To add metadata to a column: .. 
code-block:: python Column('mycolumn', doc='my column description') - - -Requirements -============ - -Install using - -- ``pip install pybigquery`` - - -Testing -============ - -Load sample tables:: - - ./scripts/load_test_data.sh - -This will create a dataset ``test_pybigquery`` with tables named ``sample_one_row`` and ``sample``. - -Set up an environment and run tests:: - - pyvenv .env - source .env/bin/activate - pip install -r dev_requirements.txt - pytest diff --git a/docs/README.rst b/docs/README.rst new file mode 100644 index 00000000..f3114b75 --- /dev/null +++ b/docs/README.rst @@ -0,0 +1,223 @@ +SQLAlchemy Dialect for BigQuery +=============================== + +|beta| |pypi| |versions| + +`SQLALchemy Dialects`_ + +- `Dialect Documentation`_ +- `Product Documentation`_ + +.. |beta| image:: https://img.shields.io/badge/support-beta-orange.svg + :target: https://github.com/googleapis/google-cloud-python/blob/master/README.rst#beta-support +.. |pypi| image:: https://img.shields.io/pypi/v/pybigquery.svg + :target: https://pypi.org/project/pybigquery/ +.. |versions| image:: https://img.shields.io/pypi/pyversions/pybigquery.svg + :target: https://pypi.org/project/pybigquery/ +.. _SQLAlchemy Dialects: https://docs.sqlalchemy.org/en/14/dialects/ +.. _Dialect Documentation: https://googleapis.dev/python/pybigquery/latest +.. _Product Documentation: https://cloud.google.com/bigquery/docs/ + + +Quick Start +----------- + +In order to use this library, you first need to go through the following steps: + +1. `Select or create a Cloud Platform project.`_ +2. [Optional] `Enable billing for your project.`_ +3. `Enable the BigQuery Storage API.`_ +4. `Setup Authentication.`_ + +.. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project +.. _Enable billing for your project.: https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project +.. 
_Enable the BigQuery Storage API.: https://console.cloud.google.com/apis/library/bigquery.googleapis.com +.. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html + +Installation +------------ + +Install this library in a `virtualenv`_ using pip. `virtualenv`_ is a tool to +create isolated Python environments. The basic problem it addresses is one of +dependencies and versions, and indirectly permissions. + +With `virtualenv`_, it's possible to install this library without needing system +install permissions, and without clashing with the installed system +dependencies. + +.. _`virtualenv`: https://virtualenv.pypa.io/en/latest/ + + +Supported Python Versions +^^^^^^^^^^^^^^^^^^^^^^^^^ +Python >= 3.6 + +Unsupported Python Versions +^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Python == 2.7, Python == 3.5. + + +Mac/Linux +^^^^^^^^^ + +.. code-block:: console + + pip install virtualenv + virtualenv + source /bin/activate + /bin/pip install pybigquery + + +Windows +^^^^^^^ + +.. code-block:: console + + pip install virtualenv + virtualenv + \Scripts\activate + \Scripts\pip.exe install pybigquery + +Usage +----- + +SQLAlchemy +^^^^^^^^^^ + +.. code-block:: python + + from sqlalchemy import * + from sqlalchemy.engine import create_engine + from sqlalchemy.schema import * + engine = create_engine('bigquery://project') + table = Table('dataset.table', MetaData(bind=engine), autoload=True) + print(select([func.count('*')], from_obj=table).scalar()) + +API Client +^^^^^^^^^^ + +.. code-block:: python + + from pybigquery.api import ApiClient + api_client = ApiClient() + print(api_client.dry_run_query(query=sqlstr).total_bytes_processed) + +Project +^^^^^^^ + +``project`` in ``bigquery://project`` is used to instantiate BigQuery client with the specific project ID. To infer project from the environment, use ``bigquery://`` – without ``project`` + +Authentication +^^^^^^^^^^^^^^ + +Follow the `Google Cloud library guide `_ for authentication. 
Alternatively, you can provide the path to a service account JSON file in ``create_engine()``: + +.. code-block:: python + + engine = create_engine('bigquery://', credentials_path='/path/to/keyfile.json') + + +Location +^^^^^^^^ + +To specify location of your datasets pass ``location`` to ``create_engine()``: + +.. code-block:: python + + engine = create_engine('bigquery://project', location="asia-northeast1") + + +Table names +^^^^^^^^^^^ + +To query tables from non-default projects or datasets, use the following format for the SQLAlchemy schema name: ``[project.]dataset``, e.g.: + +.. code-block:: python + + # If neither dataset nor project are the default + sample_table_1 = Table('natality', schema='bigquery-public-data.samples') + # If just dataset is not the default + sample_table_2 = Table('natality', schema='bigquery-public-data') + +Batch size +^^^^^^^^^^ + +By default, ``arraysize`` is set to ``5000``. ``arraysize`` is used to set the batch size for fetching results. To change it, pass ``arraysize`` to ``create_engine()``: + +.. code-block:: python + + engine = create_engine('bigquery://project', arraysize=1000) + + +Adding a Default Dataset +^^^^^^^^^^^^^^^^^^^^^^^^ + +If you want to have the ``Client`` use a default dataset, specify it as the "database" portion of the connection string. + +.. code-block:: python + + engine = create_engine('bigquery://project/dataset') + +When using a default dataset, don't include the dataset name in the table name, e.g.: + +.. code-block:: python + + table = Table('table_name') + +Note that specifying a default dataset doesn't restrict execution of queries to that particular dataset when using raw queries, e.g.: + +.. 
code-block:: python + + # Set default dataset to dataset_a + engine = create_engine('bigquery://project/dataset_a') + + # This will still execute and return rows from dataset_b + engine.execute('SELECT * FROM dataset_b.table').fetchall() + + +Connection String Parameters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are many situations where you can't call ``create_engine`` directly, such as when using tools like `Flask SQLAlchemy `_. For situations like these, or for situations where you want the ``Client`` to have a `default_query_job_config `_, you can pass many arguments in the query of the connection string. + +The ``credentials_path``, ``credentials_info``, ``location``, and ``arraysize`` parameters are used by this library, and the rest are used to create a `QueryJobConfig `_ + +Note that if you want to use query strings, it will be more reliable if you use three slashes, so ``'bigquery:///?a=b'`` will work reliably, but ``'bigquery://?a=b'`` might be interpreted as having a "database" of ``?a=b``, depending on the system being used to parse the connection string. + +Here are examples of all the supported arguments. Any not present are either for legacy sql (which isn't supported by this library), or are too complex and are not implemented. + +.. code-block:: python + + engine = create_engine( + 'bigquery://some-project/some-dataset' '?' + 'credentials_path=/some/path/to.json' '&' + 'location=some-location' '&' + 'arraysize=1000' '&' + 'clustering_fields=a,b,c' '&' + 'create_disposition=CREATE_IF_NEEDED' '&' + 'destination=different-project.different-dataset.table' '&' + 'destination_encryption_configuration=some-configuration' '&' + 'dry_run=true' '&' + 'labels=a:b,c:d' '&' + 'maximum_bytes_billed=1000' '&' + 'priority=INTERACTIVE' '&' + 'schema_update_options=ALLOW_FIELD_ADDITION,ALLOW_FIELD_RELAXATION' '&' + 'use_query_cache=true' '&' + 'write_disposition=WRITE_APPEND' + ) + + +Creating tables +^^^^^^^^^^^^^^^ + +To add metadata to a table: + +.. 
code-block:: python + + table = Table('mytable', ..., bigquery_description='my table description', bigquery_friendly_name='my table friendly name') + +To add metadata to a column: + +.. code-block:: python + + Column('mycolumn', doc='my column description') diff --git a/docs/changelog.md b/docs/changelog.md new file mode 120000 index 00000000..04c99a55 --- /dev/null +++ b/docs/changelog.md @@ -0,0 +1 @@ +../CHANGELOG.md \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst new file mode 100644 index 00000000..5de1b9aa --- /dev/null +++ b/docs/index.rst @@ -0,0 +1,20 @@ +.. include:: README.rst + +.. include:: multiprocessing.rst + +API Reference +------------- +.. toctree:: + :maxdepth: 2 + + pybigquery/api + +Changelog +--------- + +For a list of all ``pybigquery`` releases: + +.. toctree:: + :maxdepth: 2 + + changelog diff --git a/docs/pybigquery/api.rst b/docs/pybigquery/api.rst new file mode 100644 index 00000000..bc9923e5 --- /dev/null +++ b/docs/pybigquery/api.rst @@ -0,0 +1,7 @@ +PyBigQuery API +============== + +.. automodule:: pybigquery.api + :members: + :undoc-members: + :inherited-members: From 0108ca166e0e22e5abd9865b2de3c9c712c3a1ac Mon Sep 17 00:00:00 2001 From: Tim Swast Date: Thu, 25 Mar 2021 09:08:57 -0500 Subject: [PATCH 2/3] convert docs/README.rst to symlink --- docs/README.rst | 224 +----------------------------------------------- 1 file changed, 1 insertion(+), 223 deletions(-) mode change 100644 => 120000 docs/README.rst diff --git a/docs/README.rst b/docs/README.rst deleted file mode 100644 index f3114b75..00000000 --- a/docs/README.rst +++ /dev/null @@ -1,223 +0,0 @@ -SQLAlchemy Dialect for BigQuery -=============================== - -|beta| |pypi| |versions| - -`SQLALchemy Dialects`_ - -- `Dialect Documentation`_ -- `Product Documentation`_ - -.. |beta| image:: https://img.shields.io/badge/support-beta-orange.svg - :target: https://github.com/googleapis/google-cloud-python/blob/master/README.rst#beta-support -.. 
|pypi| image:: https://img.shields.io/pypi/v/pybigquery.svg - :target: https://pypi.org/project/pybigquery/ -.. |versions| image:: https://img.shields.io/pypi/pyversions/pybigquery.svg - :target: https://pypi.org/project/pybigquery/ -.. _SQLAlchemy Dialects: https://docs.sqlalchemy.org/en/14/dialects/ -.. _Dialect Documentation: https://googleapis.dev/python/pybigquery/latest -.. _Product Documentation: https://cloud.google.com/bigquery/docs/ - - -Quick Start ------------ - -In order to use this library, you first need to go through the following steps: - -1. `Select or create a Cloud Platform project.`_ -2. [Optional] `Enable billing for your project.`_ -3. `Enable the BigQuery Storage API.`_ -4. `Setup Authentication.`_ - -.. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project -.. _Enable billing for your project.: https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project -.. _Enable the BigQuery Storage API.: https://console.cloud.google.com/apis/library/bigquery.googleapis.com -.. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html - -Installation ------------- - -Install this library in a `virtualenv`_ using pip. `virtualenv`_ is a tool to -create isolated Python environments. The basic problem it addresses is one of -dependencies and versions, and indirectly permissions. - -With `virtualenv`_, it's possible to install this library without needing system -install permissions, and without clashing with the installed system -dependencies. - -.. _`virtualenv`: https://virtualenv.pypa.io/en/latest/ - - -Supported Python Versions -^^^^^^^^^^^^^^^^^^^^^^^^^ -Python >= 3.6 - -Unsupported Python Versions -^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Python == 2.7, Python == 3.5. - - -Mac/Linux -^^^^^^^^^ - -.. code-block:: console - - pip install virtualenv - virtualenv - source /bin/activate - /bin/pip install pybigquery - - -Windows -^^^^^^^ - -.. 
code-block:: console - - pip install virtualenv - virtualenv - \Scripts\activate - \Scripts\pip.exe install pybigquery - -Usage ------ - -SQLAlchemy -^^^^^^^^^^ - -.. code-block:: python - - from sqlalchemy import * - from sqlalchemy.engine import create_engine - from sqlalchemy.schema import * - engine = create_engine('bigquery://project') - table = Table('dataset.table', MetaData(bind=engine), autoload=True) - print(select([func.count('*')], from_obj=table).scalar()) - -API Client -^^^^^^^^^^ - -.. code-block:: python - - from pybigquery.api import ApiClient - api_client = ApiClient() - print(api_client.dry_run_query(query=sqlstr).total_bytes_processed) - -Project -^^^^^^^ - -``project`` in ``bigquery://project`` is used to instantiate BigQuery client with the specific project ID. To infer project from the environment, use ``bigquery://`` – without ``project`` - -Authentication -^^^^^^^^^^^^^^ - -Follow the `Google Cloud library guide `_ for authentication. Alternatively, you can provide the path to a service account JSON file in ``create_engine()``: - -.. code-block:: python - - engine = create_engine('bigquery://', credentials_path='/path/to/keyfile.json') - - -Location -^^^^^^^^ - -To specify location of your datasets pass ``location`` to ``create_engine()``: - -.. code-block:: python - - engine = create_engine('bigquery://project', location="asia-northeast1") - - -Table names -^^^^^^^^^^^ - -To query tables from non-default projects or datasets, use the following format for the SQLAlchemy schema name: ``[project.]dataset``, e.g.: - -.. code-block:: python - - # If neither dataset nor project are the default - sample_table_1 = Table('natality', schema='bigquery-public-data.samples') - # If just dataset is not the default - sample_table_2 = Table('natality', schema='bigquery-public-data') - -Batch size -^^^^^^^^^^ - -By default, ``arraysize`` is set to ``5000``. ``arraysize`` is used to set the batch size for fetching results. 
To change it, pass ``arraysize`` to ``create_engine()``: - -.. code-block:: python - - engine = create_engine('bigquery://project', arraysize=1000) - - -Adding a Default Dataset -^^^^^^^^^^^^^^^^^^^^^^^^ - -If you want to have the ``Client`` use a default dataset, specify it as the "database" portion of the connection string. - -.. code-block:: python - - engine = create_engine('bigquery://project/dataset') - -When using a default dataset, don't include the dataset name in the table name, e.g.: - -.. code-block:: python - - table = Table('table_name') - -Note that specifying a default dataset doesn't restrict execution of queries to that particular dataset when using raw queries, e.g.: - -.. code-block:: python - - # Set default dataset to dataset_a - engine = create_engine('bigquery://project/dataset_a') - - # This will still execute and return rows from dataset_b - engine.execute('SELECT * FROM dataset_b.table').fetchall() - - -Connection String Parameters -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -There are many situations where you can't call ``create_engine`` directly, such as when using tools like `Flask SQLAlchemy `_. For situations like these, or for situations where you want the ``Client`` to have a `default_query_job_config `_, you can pass many arguments in the query of the connection string. - -The ``credentials_path``, ``credentials_info``, ``location``, and ``arraysize`` parameters are used by this library, and the rest are used to create a `QueryJobConfig `_ - -Note that if you want to use query strings, it will be more reliable if you use three slashes, so ``'bigquery:///?a=b'`` will work reliably, but ``'bigquery://?a=b'`` might be interpreted as having a "database" of ``?a=b``, depending on the system being used to parse the connection string. - -Here are examples of all the supported arguments. Any not present are either for legacy sql (which isn't supported by this library), or are too complex and are not implemented. - -.. 
code-block:: python - - engine = create_engine( - 'bigquery://some-project/some-dataset' '?' - 'credentials_path=/some/path/to.json' '&' - 'location=some-location' '&' - 'arraysize=1000' '&' - 'clustering_fields=a,b,c' '&' - 'create_disposition=CREATE_IF_NEEDED' '&' - 'destination=different-project.different-dataset.table' '&' - 'destination_encryption_configuration=some-configuration' '&' - 'dry_run=true' '&' - 'labels=a:b,c:d' '&' - 'maximum_bytes_billed=1000' '&' - 'priority=INTERACTIVE' '&' - 'schema_update_options=ALLOW_FIELD_ADDITION,ALLOW_FIELD_RELAXATION' '&' - 'use_query_cache=true' '&' - 'write_disposition=WRITE_APPEND' - ) - - -Creating tables -^^^^^^^^^^^^^^^ - -To add metadata to a table: - -.. code-block:: python - - table = Table('mytable', ..., bigquery_description='my table description', bigquery_friendly_name='my table friendly name') - -To add metadata to a column: - -.. code-block:: python - - Column('mycolumn', doc='my column description') diff --git a/docs/README.rst b/docs/README.rst new file mode 120000 index 00000000..89a01069 --- /dev/null +++ b/docs/README.rst @@ -0,0 +1 @@ +../README.rst \ No newline at end of file From 885e824f2cbce1673eb8dd0c3ca3904edd937edb Mon Sep 17 00:00:00 2001 From: Tim Swast Date: Thu, 25 Mar 2021 10:04:05 -0500 Subject: [PATCH 3/3] clean up "unsupported Python versions" Co-authored-by: Anthonios Partheniou --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index f3114b75..1b3dc36f 100644 --- a/README.rst +++ b/README.rst @@ -54,7 +54,7 @@ Python >= 3.6 Unsupported Python Versions ^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Python == 2.7, Python == 3.5. +Python <= 3.5. Mac/Linux
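
Note on the "Connection String Parameters" text carried by this series: the README recommends the three-slash form (``bigquery:///?a=b``) when no project or dataset is given. Below is a minimal sketch of one way to assemble such a URL using only the standard library. ``build_bigquery_url`` is a hypothetical helper for illustration and is not part of pybigquery; the parameter names are the ones the README lists, and the resulting string would be passed to ``create_engine()`` as the README describes.

```python
from urllib.parse import urlencode


def build_bigquery_url(project="", dataset="", **params):
    """Hypothetical helper (not part of pybigquery): assemble a bigquery:// URL.

    With no project and no dataset the result starts with three slashes,
    which is the form the README describes as parsing reliably.
    """
    url = f"bigquery://{project}/{dataset}"
    if params:
        # Keep '/', ':' and ',' unescaped so paths and list values stay readable.
        url += "?" + urlencode(params, safe="/:,")
    return url


# No project or dataset: the query string follows three slashes.
url = build_bigquery_url(location="asia-northeast1", arraysize=1000)
print(url)  # bigquery:///?location=asia-northeast1&arraysize=1000

# Explicit project and default dataset, plus a credentials path.
print(build_bigquery_url("some-project", "some-dataset",
                         credentials_path="/some/path/to.json"))
```

Always emitting the slash before the dataset keeps the query string out of the "database" position, the ambiguity the README warns about.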