Commit e049810

tswast authored and waltaskew committed
fix!: remove out-of-date BigQuery ML protocol buffers (googleapis#1178)
deps!: BigQuery Storage and pyarrow are required dependencies (googleapis#776)
fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (googleapis#786)
feat!: destination tables are no-longer removed by `create_job` (googleapis#891)
feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (googleapis#972)
fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (googleapis#972)
feat!: mark the package as type-checked (googleapis#1058)
feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (googleapis#1061)
feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (googleapis#967)
fix: improve type annotations for mypy validation (googleapis#1081)
feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (googleapis#1117)
docs: Add migration guide from version 2.x to 3.x (googleapis#1027)
Release-As: 3.0.0
1 parent 585239f commit e049810

274 files changed: +5282 -2797 lines changed

‎.coveragerc

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ fail_under = 100
 show_missing = True
 omit =
     google/cloud/bigquery/__init__.py
+    google/cloud/bigquery_v2/*  # Legacy proto-based types.
 exclude_lines =
     # Re-enable the standard pragma
     pragma: NO COVER

‎README.rst

Lines changed: 1 addition & 4 deletions
@@ -1,7 +1,7 @@
 Python Client for Google BigQuery
 =================================

-|GA| |pypi| |versions|
+|GA| |pypi| |versions|

 Querying massive datasets can be time consuming and expensive without the
 right hardware and infrastructure. Google `BigQuery`_ solves this problem by
@@ -140,6 +140,3 @@ In this example all tracing data will be published to the Google

 .. _OpenTelemetry documentation: https://opentelemetry-python.readthedocs.io
 .. _Cloud Trace: https://cloud.google.com/trace
-
-
-

‎UPGRADING.md

Lines changed: 185 additions & 1 deletion
@@ -11,6 +11,190 @@ See the License for the specific language governing permissions and
 limitations under the License.
 -->

+# 3.0.0 Migration Guide
+
+## New Required Dependencies
+
+Some of the previously optional dependencies are now *required* in `3.x` versions of the
+library, namely
+[google-cloud-bigquery-storage](https://pypi.org/project/google-cloud-bigquery-storage/)
+(minimum version `2.0.0`) and [pyarrow](https://pypi.org/project/pyarrow/) (minimum
+version `3.0.0`).
+
+The behavior of some of the package "extras" has thus also changed:
+* The `pandas` extra now requires the [db-dtypes](https://pypi.org/project/db-dtypes/)
+package.
+* The `bqstorage` extra has been preserved for compatibility reasons, but it is now a
+no-op and should be omitted when installing the BigQuery client library.
+
+**Before:**
+```
+$ pip install google-cloud-bigquery[bqstorage]
+```
+
+**After:**
+```
+$ pip install google-cloud-bigquery
+```
+
+* The `bignumeric_type` extra has been removed, as the `BIGNUMERIC` type is now
+automatically supported. That extra should thus not be used.
+
+**Before:**
+```
+$ pip install google-cloud-bigquery[bignumeric_type]
+```
+
+**After:**
+```
+$ pip install google-cloud-bigquery
+```
+
+
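
As a rough illustration of the new dependency requirements, the sketch below checks an environment against the minimum versions quoted in the guide. The helper names (`parse_version`, `unmet_requirements`) are hypothetical and not part of the library; only `importlib.metadata` from the standard library is assumed.

```python
from importlib import metadata

# Minimum versions quoted in the migration guide; the helpers are hypothetical.
REQUIRED_MINIMUMS = {
    "google-cloud-bigquery-storage": (2, 0, 0),
    "pyarrow": (3, 0, 0),
}


def parse_version(text):
    """Turn '3.0.0' or '3.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def unmet_requirements(minimums=REQUIRED_MINIMUMS):
    """Return {package: reason} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        found = parse_version(installed)
        found += (0,) * (len(minimum) - len(found))  # pad "3.0" to (3, 0, 0)
        if found < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems[name] = f"found {installed}, need >= {wanted}"
    return problems
```

Calling `unmet_requirements()` in an environment that satisfies both minimums should return an empty dictionary; the version parsing is deliberately simple and ignores pre-release ordering.
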
55+
56+
The library is now type-annotated and declares itself as such. If you use a static
57+
type checker such as `mypy`, you might start getting errors in places where
58+
`google-cloud-bigquery` package is used.
59+
60+
It is recommended to update your code and/or type annotations to fix these errors, but
61+
if this is not feasible in the short term, you can temporarily ignore type annotations
62+
in `google-cloud-bigquery`, for example by using a special `# type: ignore` comment:
63+
64+
```py
65+
from google.cloud import bigquery # type: ignore
66+
```
67+
68+
But again, this is only recommended as a possible short-term workaround if immediately
69+
fixing the type check errors in your project is not feasible.
70+
+## Re-organized Types
+
+The auto-generated parts of the library have been removed, and proto-based types formerly
+found in `google.cloud.bigquery_v2` have been replaced by the new implementation (but
+see the [section](#legacy-protobuf-types) below).
+
+For example, the standard SQL data types should now be imported from a new location:
+
+**Before:**
+```py
+from google.cloud.bigquery_v2 import StandardSqlDataType
+from google.cloud.bigquery_v2.types import StandardSqlField
+from google.cloud.bigquery_v2.types.standard_sql import StandardSqlStructType
+```
+
+**After:**
+```py
+from google.cloud.bigquery import StandardSqlDataType
+from google.cloud.bigquery.standard_sql import StandardSqlField
+from google.cloud.bigquery.standard_sql import StandardSqlStructType
+```
+
+The `TypeKind` enum defining all possible SQL types for schema fields has been renamed
+and is not nested anymore under `StandardSqlDataType`:
+
+
+**Before:**
+```py
+from google.cloud.bigquery_v2 import StandardSqlDataType
+
+if field_type == StandardSqlDataType.TypeKind.STRING:
+    ...
+```
+
+**After:**
+```py
+
+from google.cloud.bigquery import StandardSqlTypeNames
+
+if field_type == StandardSqlTypeNames.STRING:
+    ...
+```
+
+
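
For code that has to run against both major versions, one possible approach is to try the 3.x import location first and fall back to the 2.x one. The sketch below is only an illustration: `resolve_first` and `TYPE_NAME_CANDIDATES` are hypothetical names, and the import locations are the ones shown in the Before/After snippets of the guide.

```python
import importlib


def resolve_first(candidates):
    """Return the first importable object from (module, dotted attribute) pairs."""
    for module_name, attr_path in candidates:
        try:
            obj = importlib.import_module(module_name)
            for attr in attr_path.split("."):
                obj = getattr(obj, attr)
        except (ImportError, AttributeError):
            continue  # Not available in the installed version; try the next one.
        return obj
    raise ImportError(f"none of the candidates could be imported: {candidates!r}")


# The 3.x location is tried first, so the legacy `bigquery_v2` package is only
# imported when the new location is unavailable.
TYPE_NAME_CANDIDATES = [
    ("google.cloud.bigquery", "StandardSqlTypeNames"),
    ("google.cloud.bigquery_v2", "StandardSqlDataType.TypeKind"),
]
```

With the library installed, `resolve_first(TYPE_NAME_CANDIDATES).STRING` would then refer to the string type kind under either version.
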
+## Issuing queries with `Client.create_job` preserves destination table
+
+The `Client.create_job` method no longer removes the destination table from a
+query job's configuration. The destination table for the query can thus be
+explicitly defined by the user.
+
+
+## Changes to data types when reading a pandas DataFrame
+
+The default dtypes returned by the `to_dataframe` method have changed.
+
+* Now, the BigQuery `BOOLEAN` data type maps to the pandas `boolean` dtype.
+Previously, this mapped to the pandas `bool` dtype when the column did not
+contain `NULL` values and the pandas `object` dtype when `NULL` values were
+present.
+* Now, the BigQuery `INT64` data type maps to the pandas `Int64` dtype.
+Previously, this mapped to the pandas `int64` dtype when the column did not
+contain `NULL` values and the pandas `float64` dtype when `NULL` values were
+present.
+* Now, the BigQuery `DATE` data type maps to the pandas `dbdate` dtype, which
+is provided by the
+[db-dtypes](https://googleapis.dev/python/db-dtypes/latest/index.html)
+package. If any date value is outside of the range of
+[pandas.Timestamp.min](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.min.html)
+(1677-09-22) and
+[pandas.Timestamp.max](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.max.html)
+(2262-04-11), the data type maps to the pandas `object` dtype. The
+`date_as_object` parameter has been removed.
+* Now, the BigQuery `TIME` data type maps to the pandas `dbtime` dtype, which
+is provided by the
+[db-dtypes](https://googleapis.dev/python/db-dtypes/latest/index.html)
+package.
+
+
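
The `DATE` rule above can be sketched with the standard library alone. The helper below is hypothetical and is not how `to_dataframe` is implemented; it only mirrors the documented rule, and it assumes the two bounds quoted in the guide are inclusive calendar dates.

```python
import datetime

# Bounds quoted in the guide for pandas.Timestamp.min / pandas.Timestamp.max,
# treated here as inclusive calendar dates (an assumption of this sketch).
PANDAS_MIN_DATE = datetime.date(1677, 9, 22)
PANDAS_MAX_DATE = datetime.date(2262, 4, 11)


def expected_date_dtype(values):
    """Predict the dtype name for a DATE column: 'dbdate' or 'object'."""
    for value in values:
        if value is None:
            continue  # NULLs do not affect the range check in this sketch.
        if not (PANDAS_MIN_DATE <= value <= PANDAS_MAX_DATE):
            return "object"
    return "dbdate"
```

A single out-of-range value is enough to switch the whole column to `object`, which is why downstream code that relied on `date_as_object` may want to check the resulting dtype explicitly.
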
+## Changes to data types when loading a pandas DataFrame
+
+In the absence of schema information, pandas columns with naive
+`datetime64[ns]` values, i.e. without timezone information, are recognized and
+loaded using the `DATETIME` type. On the other hand, for columns with
+timezone-aware `datetime64[ns, UTC]` values, the `TIMESTAMP` type continues
+to be used.
+
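
The naive versus timezone-aware distinction can be illustrated on single values with the standard library. This is only an analogy under stated assumptions: the library inspects pandas column dtypes, whereas the hypothetical `bigquery_type_for` helper below looks at one `datetime.datetime` object.

```python
import datetime


def bigquery_type_for(value):
    """Return 'DATETIME' for naive datetimes and 'TIMESTAMP' for aware ones."""
    is_aware = value.tzinfo is not None and value.utcoffset() is not None
    return "TIMESTAMP" if is_aware else "DATETIME"


naive = datetime.datetime(2021, 6, 1, 12, 30)
aware = datetime.datetime(2021, 6, 1, 12, 30, tzinfo=datetime.timezone.utc)
```

Here `naive` carries no offset and corresponds to the `DATETIME` case, while `aware` is pinned to UTC and corresponds to the `TIMESTAMP` case.
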
+## Changes to `Model`, `Client.get_model`, `Client.update_model`, and `Client.list_models`
+
+The types of several `Model` properties have been changed.
+
+- `Model.feature_columns` now returns a sequence of `google.cloud.bigquery.standard_sql.StandardSqlField`.
+- `Model.label_columns` now returns a sequence of `google.cloud.bigquery.standard_sql.StandardSqlField`.
+- `Model.model_type` now returns a string.
+- `Model.training_runs` now returns a sequence of dictionaries, as received from the [BigQuery REST API](https://cloud.google.com/bigquery/docs/reference/rest/v2/models#Model.FIELDS.training_runs).
+
+<a name="legacy-protobuf-types"></a>
+## Legacy Protocol Buffers Types
+
+For compatibility reasons, the legacy proto-based types still exist as static code
+and can be imported:
+
+```py
+from google.cloud.bigquery_v2 import Model  # a subclass of proto.Message
+```
+
+Mind, however, that importing them will issue a warning, because aside from
+being importable, these types **are not maintained anymore**. They may differ
+both from the types in `google.cloud.bigquery`, and from the types supported on
+the backend.
+
+### Maintaining compatibility with `google-cloud-bigquery` version 2.0
+
+If you maintain a library or system that needs to support both
+`google-cloud-bigquery` version 2.x and 3.x, it is recommended that you detect
+when version 2.x is in use and convert properties that use the legacy protocol
+buffer types, such as `Model.training_runs`, into the types used in 3.x.
+
+Call the [`to_dict`
+method](https://proto-plus-python.readthedocs.io/en/latest/reference/message.html#proto.message.Message.to_dict)
+on the protocol buffers objects to get a JSON-compatible dictionary.
+
+```py
+from google.cloud.bigquery_v2 import Model
+
+training_run: Model.TrainingRun = ...
+# In proto-plus, `to_dict` is called on the message class with the instance.
+training_run_dict = Model.TrainingRun.to_dict(training_run)
+```
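
A version-agnostic conversion could look like the sketch below. It assumes, as in proto-plus, that `to_dict` is exposed on the message class and takes the instance as its argument; `as_plain_dict` and `FakeTrainingRun` are hypothetical names used only for illustration.

```python
def as_plain_dict(value):
    """Return `value` as a dictionary, whichever major version produced it."""
    if isinstance(value, dict):
        return value  # 3.x style: already a plain dictionary.
    to_dict = getattr(type(value), "to_dict", None)
    if to_dict is None:
        raise TypeError(f"cannot convert {type(value).__name__} to a dictionary")
    return to_dict(value)  # 2.x style: convert via the message class.


class FakeTrainingRun:
    """Hypothetical stand-in for a proto-plus message, used only for this demo."""

    def __init__(self, **fields):
        self._fields = fields

    @staticmethod
    def to_dict(instance):
        return dict(instance._fields)
```

Applying `as_plain_dict` to each element of `Model.training_runs` would then give dictionaries under both 2.x and 3.x, so the rest of the calling code does not need to branch on the installed version.
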

 # 2.0.0 Migration Guide

@@ -56,4 +240,4 @@ distance_type = enums.Model.DistanceType.COSINE
 from google.cloud.bigquery_v2 import types

 distance_type = types.Model.DistanceType.COSINE
-```
+```

‎docs/bigquery/legacy_proto_types.rst

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+Legacy proto-based Types for Google Cloud BigQuery v2 API
+=========================================================
+
+.. warning::
+    These types are provided for backward compatibility only, and are not maintained
+    anymore. They might also differ from the types supported on the backend. It is
+    therefore strongly advised to migrate to the types found in :doc:`standard_sql`.
+
+Also see the :doc:`3.0.0 Migration Guide<../UPGRADING>` for more information.
+
+.. automodule:: google.cloud.bigquery_v2.types
+    :members:
+    :undoc-members:
+    :show-inheritance:

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Types for Google Cloud Bigquery v2 API
 ======================================

-.. automodule:: google.cloud.bigquery_v2.types
+.. automodule:: google.cloud.bigquery.standard_sql
     :members:
     :undoc-members:
     :show-inheritance:

‎docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -109,12 +109,12 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 exclude_patterns = [
+    "google/cloud/bigquery_v2/**",  # Legacy proto-based types.
     "_build",
     "**/.nox/**/*",
     "samples/AUTHORING_GUIDE.md",
     "samples/CONTRIBUTING.md",
     "samples/snippets/README.rst",
-    "bigquery_v2/services.rst",  # generated by the code generator
 ]

 # The reST default role (used for this markup: `text`) to use for all

‎docs/index.rst

Lines changed: 2 additions & 1 deletion
@@ -30,7 +30,8 @@ API Reference
 Migration Guide
 ---------------

-See the guide below for instructions on migrating to the 2.x release of this library.
+See the guides below for instructions on migrating from older to newer *major* releases
+of this library (from ``1.x`` to ``2.x``, or from ``2.x`` to ``3.x``).

 .. toctree::
     :maxdepth: 2

‎docs/reference.rst

Lines changed: 17 additions & 2 deletions
@@ -202,9 +202,24 @@ Encryption Configuration
 Additional Types
 ================

-Protocol buffer classes for working with the Models API.
+Helper SQL type classes.

 .. toctree::
     :maxdepth: 2

-    bigquery_v2/types
+    bigquery/standard_sql
+
+
+Legacy proto-based Types (deprecated)
+=====================================
+
+The legacy type classes based on protocol buffers.
+
+.. deprecated:: 3.0.0
+    These types are provided for backward compatibility only, and are not maintained
+    anymore.
+
+.. toctree::
+    :maxdepth: 2
+
+    bigquery/legacy_proto_types

‎docs/snippets.py

Lines changed: 0 additions & 4 deletions
@@ -30,10 +30,6 @@
     import pandas
 except (ImportError, AttributeError):
     pandas = None
-try:
-    import pyarrow
-except (ImportError, AttributeError):
-    pyarrow = None

 from google.api_core.exceptions import InternalServerError
 from google.api_core.exceptions import ServiceUnavailable

‎docs/usage/pandas.rst

Lines changed: 35 additions & 3 deletions
@@ -14,12 +14,12 @@ First, ensure that the :mod:`pandas` library is installed by running:

     pip install --upgrade pandas

-Alternatively, you can install the BigQuery python client library with
+Alternatively, you can install the BigQuery Python client library with
 :mod:`pandas` by running:

 .. code-block:: bash

-    pip install --upgrade google-cloud-bigquery[pandas]
+    pip install --upgrade 'google-cloud-bigquery[pandas]'

 To retrieve query results as a :class:`pandas.DataFrame`:

@@ -37,6 +37,38 @@ To retrieve table rows as a :class:`pandas.DataFrame`:
     :start-after: [START bigquery_list_rows_dataframe]
     :end-before: [END bigquery_list_rows_dataframe]

+The following data types are used when creating a pandas DataFrame.
+
+.. list-table:: Pandas Data Type Mapping
+    :header-rows: 1
+
+    * - BigQuery
+      - pandas
+      - Notes
+    * - BOOL
+      - boolean
+      -
+    * - DATETIME
+      - datetime64[ns], object
+      - The object dtype is used when there are values not representable in a
+        pandas nanosecond-precision timestamp.
+    * - DATE
+      - dbdate, object
+      - The object dtype is used when there are values not representable in a
+        pandas nanosecond-precision timestamp.
+
+        Requires the ``db-dtypes`` package. See the `db-dtypes usage guide
+        <https://googleapis.dev/python/db-dtypes/latest/usage.html>`_
+    * - FLOAT64
+      - float64
+      -
+    * - INT64
+      - Int64
+      -
+    * - TIME
+      - dbtime
+      - Requires the ``db-dtypes`` package. See the `db-dtypes usage guide
+        <https://googleapis.dev/python/db-dtypes/latest/usage.html>`_

 Retrieve BigQuery GEOGRAPHY data as a GeoPandas GeoDataFrame
 ------------------------------------------------------------
@@ -60,7 +92,7 @@ As of version 1.3.0, you can use the
 to load data from a :class:`pandas.DataFrame` to a
 :class:`~google.cloud.bigquery.table.Table`. To use this function, in addition
 to :mod:`pandas`, you will need to install the :mod:`pyarrow` library. You can
-install the BigQuery python client library with :mod:`pandas` and
+install the BigQuery Python client library with :mod:`pandas` and
 :mod:`pyarrow` by running:

 .. code-block:: bash
