Releases: googleapis/python-bigquery-dataframes
v2.4.0
2.4.0 (2025-05-12)
Features
- Add "dayofyear" property for dt accessors (#1692) (9d4a59d)
- Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. (#1713) (2b3a45f)
- Add DatetimeIndex class (#1719) (c3c830c)
- Add isocalendar() for dt accessor (#1717) (0479763)
- Add bigframes.bigquery.json_value (#1697) (46a9c53)
- Add blob.exif function support (#1703) (3f79528)
- Add inplace arg support to sort methods (#1710) (d1ccb52)
- Improve error message in Series.apply for direct udfs (#1673) (1a658b2)
- Publish bigframes blob(Multimodal) to preview (#1693) (e4c85ba)
- Support () operator between timedeltas (#1702) (edaac89)
- Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models (#1305) (b16740e)
- Support to_strip parameter for str.strip, str.lstrip and str.rstrip (#1705) (a84ee75)
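The new timedelta accessors presumably follow pandas, whose timedelta components mirror Python's datetime.timedelta. As a rough illustration using only the standard library (not the bigframes API itself), the sketch below shows how days, seconds, microseconds and total_seconds() relate to each other, along with the day-of-year and ISO calendar values for this release's date.

```python
from datetime import date, timedelta

# A duration of 2 days, 3 hours, 4 seconds and 5 microseconds.
delta = timedelta(days=2, hours=3, seconds=4, microseconds=5)

# The three stored components: whole days, leftover seconds (0..86399),
# and leftover microseconds (0..999999).
components = (delta.days, delta.seconds, delta.microseconds)

# total_seconds() folds all three components into a single float.
total = delta.total_seconds()

# Day-of-year and ISO calendar (year, week, weekday) for the 2.4.0 release date.
release_day = date(2025, 5, 12)
day_of_year = release_day.timetuple().tm_yday
iso_year, iso_week, iso_weekday = release_day.isocalendar()

print(components, total, day_of_year, (iso_year, iso_week, iso_weekday))
```

Note that seconds is the leftover part within a day, not the full duration; total_seconds() is the value to use when the whole span is needed.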
Bug Fixes
- Fix dayofyear doc test (#1701) (9b777a0)
- Fix issues with chunked arrow data (#1700) (e3289b7)
- Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() (#1691) (8ec6079)
Performance Improvements
- Defer query in read_gbq with wildcard tables (#1661) (5c125c9)
- Rechunk result pages client side (#1680) (67d8760)
Dependencies
Documentation
- Add snippets for Matrix Factorization tutorials (#1630) (24b37ae)
- Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results (#1597) (18780b4)
- Include import statement in the bigframes code snippet (#1699) (08d70b6)
- Include the clean-up step in the udf code snippet (#1698) (48992e2)
- Move multimodal notebook out of experimental folder (#1712) (68b6532)
- Update blob_display option in snippets (#1714) (8b30143)
v2.3.0
2.3.0 (2025-05-06)
Features
Bug Fixes
- Guarantee guid thread safety across threads (#1684) (cb0267d)
- Support large lists of lists in bpd.Series() constructor (#1662) (0f4024c)
- Use value equality to check types for unix epoch functions and timestamp diff (#1690) (81e8fb8)
Performance Improvements
Documentation
v2.2.0
2.2.0 (2025-04-30)
Features
- Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints (#1650) (4fb54df)
- Add GeminiTextGenerator.predict structured output (#1653) (6199023)
- DataFrames.getitem support for slice input (#1668) (563f0cb)
- Print right origin of PreviewWarning for the bpd.udf (#1629) (48d10d1)
- Session.bytes_processed_sum will be updated when allow_large_re… (#1669) (ae312db)
- Short circuit query for local scan (#1618) (e84f232)
- Support names parameter in read_csv for bigquery engine (#1659) (3388191)
- Support passing list of values to bigframes.core.sql.simple_literal (#1641) (102d363)
- Support write api as loading option (#1617) (c46ad06)
Bug Fixes
- DataFrame accessors are not populated (#1639) (28afa2c)
- Prefer remote schema instead of throwing on materialize conflicts (#1644) (53fc25b)
- Remove itertools.pairwise usage (#1638) (9662745)
- Resolve issue where pre-release versions of google-auth are installed (#1491) (ebb7a5e)
- Resolve some of the typo errors (#1655) (cd7fbde)
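For context on the itertools.pairwise fix: that function was only added in Python 3.10, so dropping it presumably keeps the library importable on older interpreters (an assumption; the entry does not state the motivation). A minimal sketch of an equivalent helper, using the hypothetical name pairwise_compat, might look like this:

```python
from itertools import tee


def pairwise_compat(iterable):
    """Yield overlapping pairs (s0, s1), (s1, s2), ... from iterable.

    Hypothetical helper for illustration; not taken from the bigframes source.
    """
    first, second = tee(iterable)
    # Advance the second iterator by one so the two are offset by a single item.
    next(second, None)
    return zip(first, second)


pairs = list(pairwise_compat([1, 2, 3, 4]))
print(pairs)
```

An empty or single-element input simply yields no pairs.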
Performance Improvements
Dependencies
Documentation
v2.1.0
2.1.0 (2025-04-22)
Features
- Add bigframes.bigquery.st_distance function (#1637) (bf1ae70)
- Enable local json string validations (#1614) (233347a)
- Enhance read_csv index_col parameter support (#1631) (f4e5b26)
Bug Fixes
- Add retry for test_clean_up_via_context_manager (#1627) (58e7cb0)
- Improve robustness of managed udf code extraction (#1634) (8cc56d5)
Documentation
v2.0.0
2.0.0 (2025-04-17)
⚠ BREAKING CHANGES
- make dataset and name params mandatory in udf (#1619)
- Locational endpoints support is not available in BigFrames 2.0.
- change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
- change default ingress setting for remote_function to internal-only (#1544)
- make remote_function params keyword only (#1537)
- make remote_function default service account explicit (#1537)
- set allow_large_results=False by default (#1541)
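To illustrate what "keyword only" means for callers, here is a minimal pure-Python sketch. The function register_remote and its signature are hypothetical stand-ins, not the real remote_function signature; only the parameter names input_types, output_type, dataset and cloud_function_service_account are taken from these notes, and the service account value is a placeholder.

```python
def register_remote(input_types=None, output_type=None, dataset=None, *,
                    cloud_function_service_account=None):
    """Hypothetical stand-in showing positional vs. keyword-only parameters."""
    return {
        "input_types": input_types,
        "output_type": output_type,
        "dataset": dataset,
        "cloud_function_service_account": cloud_function_service_account,
    }


# Parameters before the bare `*` may be passed positionally;
# everything after it must be named explicitly.
ok = register_remote(
    [int], str, "my_dataset",
    cloud_function_service_account="sa@example.iam.gserviceaccount.com",
)

# Passing a keyword-only parameter positionally raises TypeError.
try:
    register_remote([int], str, "my_dataset", "sa@example.iam.gserviceaccount.com")
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(ok["dataset"], positional_rejected)
```

Code that passed such arguments positionally before an upgrade would need to switch to named arguments.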
Features
- Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() (#1556) (45c9d9f)
- Add component to manage temporary tables (#1559) (0a4e245)
- Add Series.to_pandas_batches() method (#1592) (09ce979)
- Add support for creating a Matrix Factorization model (#1330) (b5297f9)
- Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
- Allow pandas.cut 'labels' parameter to accept a list of strings (#1549) (af842b1)
- Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
- Detect duplicate column/index names in read_gbq before sending the query. (#1615) (40d6960)
- Drop support for locational endpoints (#1542) (4bf2e43)
- Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy (#1605) (b4b7073)
- Improve local data validation (#1598) (815e471)
- Make remote_function default service account explicit (#1537) (9eb9089)
- Set allow_large_results=False by default (#1541) (e9fb712)
- Support bigquery connection in managed function (#1554) (f6f697a)
- Support bq connection path format (#1550) (e7eb918)
- Support gemini-2.0-X models (#1558) (3104fab)
- Support inlining small list, struct, json data (#1589) (2ce891f)
- Support time range rolling on Series. (#1590) (6e98a2c)
- Use session temp tables for all ephemeral storage (#1569) (9711b83)
- Use validated local storage for data uploads (#1612) (aee4159)
- Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() (#1573) (b9623da)
Bug Fixes
- to_pandas_batches() respects page_size and max_results again (#1572) (27c5905)
- Ensure page_size works correctly in to_pandas_batches when max_results is not set (#1588) (570cff3)
- Include role and service account in IAM exception (#1564) (8c50755)
- Make dataset and name params mandatory in udf (#1619) (637e860)
- Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
- Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
- Read_csv supports tilde local paths and includes index for bigquery_stream write engine (#1580) (352e8e4)
- Use dictionaries to avoid problematic google.iam namespace (#1611) (b03e44f)
Performance Improvements
Dependencies
- Remove jellyfish dependency (#1604) (1ac0e1e)
- Remove parsy dependency (#1610) (293f676)
- Remove test dependency on pytest-mock package (#1622) (1ba72ea)
- Support shapely versions 1.8.5+ (#1621) (e39ee3b)
Documentation
- Add details for bigquery_connection in @bpd.udf docstring ([#1609](https://github.com/googleapis/python-bigq...
v2.0.0.dev0
2.0.0.dev0 (2025-03-31)
⚠ BREAKING CHANGES
- Locational endpoints support is not available in BigFrames 2.0.
- change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator (#1558)
- change default ingress setting for remote_function to internal-only (#1544)
- make remote_function params keyword only (#1537)
- make remote_function default service account explicit (#1537)
- set allow_large_results=False by default (#1541)
Features
- Add component to manage temporary tables (#1559) (0a4e245)
- Allow input_types, output_type, and dataset to be used positionally in remote_function (#1560) (bcac8c6)
- Allow pandas.cut 'labels' parameter to accept a list of strings (#1549) (af842b1)
- Change default ingress setting for remote_function to internal-only (#1544) (c848a80)
- Drop support for locational endpoints (#1542) (4bf2e43)
- Make remote_function default service account explicit (#1537) (9eb9089)
- Set allow_large_results=False by default (#1541) (e9fb712)
- Support bigquery connection in managed function (#1554) (f6f697a)
- Support bq connection path format (#1550) (e7eb918)
- Support gemini-2.0-X models (#1558) (3104fab)
Bug Fixes
- Include role and service account in IAM exception (#1564) (8c50755)
- Pandas.cut returns labels index for numeric breaks when labels=False (#1548) (b2375de)
- Prevent KeyError in bpd.concat with empty DF and struct/array types DF (#1568) (b4da1cf)
Documentation
- Add message to remove default model for version 3.0 (#1563) (910be2b)
- Add warning for bigframes 2.0 (#1557) (3f0eaa1)
- Remove gemini-1.5 deprecation warning for GeminiTextGenerator (#1562) (0cc6784)
- Use restructured text to allow publishing to PyPI (#1565) (d1e9ec2)
Miscellaneous Chores
v1.42.0
1.42.0 (2025-03-27)
Features
- Add closed parameter in rolling() (#1539) (8bcc89b)
- Add GeoSeries.difference() and bigframes.bigquery.st_difference() (#1471) (e9fe815)
- Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() (#1529) (8542bd4)
- Add df.take and series.take (#1509) (7d00be6)
- Add Linear_Regression.global_explain() (#1446) (7e5b6a8)
- Allow iloc to support lists of negative indices (#1497) (a9cf215)
- Support dry_run in to_pandas() (#1436) (75fc7e0)
- Support window partition by geo column (#1512) (bdcb1e7)
- Upgrade BQ managed udf to preview (#1536) (4a7fe4d)
Bug Fixes
- Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X (#1534) (c93e720)
- Change the default value for pdf extract/chunk (#1517) (a70a607)
- Local data always has sequential index (#1514) (014bd33)
- Read_pandas inline returns None when exceeds limit (#1525) (578081e)
- Temporary fix for StreamingDataFrame not working backend bug (#1533) (6ab4ffd)
- Tolerate BQ connection service account propagation delay (#1505) (6681f1f)
Performance Improvements
Documentation
v1.41.0
1.41.0 (2025-03-19)
Features
- Add support for the 'right' parameter in 'pandas.cut' (#1496) (8aff128)
- Support BQ managed functions through read_gbq_function (#1476) (802183d)
- Warn when the BigFrames version is more than a year old (#1455) (00e0750)
Bug Fixes
- Fix pandas.cut errors with empty bins (#1499) (434fb5d)
- Fix read_gbq with ORDER BY query and index_col set (#963) (de46d2f)
Performance Improvements
Documentation
v1.40.0
1.40.0 (2025-03-11)
⚠ BREAKING CHANGES
- reading JSON data as a custom arrow extension type (#1458)
Features
- Reading JSON data as a custom arrow extension type (#1458) (e720f41)
- Support list output for managed function (#1457) (461e9e0)
Bug Fixes
- Fix list-like indexers in partial ordering mode (#1456) (fe72ada)
- Fix the merge issue between 1424 and 1373 (#1461) (7b6e361)
- Use == instead of is for timedelta type equality checks (#1480) (0db248b)
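The == versus is distinction in the last fix matters whenever a type descriptor can be constructed more than once. The sketch below uses a hypothetical DurationType dataclass as a stand-in for a real dtype object (it is not a bigframes or pyarrow class) to show why an identity check can fail where a value check succeeds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurationType:
    """Hypothetical stand-in for a dtype object, compared by value."""
    unit: str


# A module-level "canonical" type and a second instance built elsewhere,
# for example while reading a schema.
TIMEDELTA = DurationType("us")
incoming = DurationType("us")

same_value = incoming == TIMEDELTA   # compares fields
same_object = incoming is TIMEDELTA  # compares object identity

print(same_value, same_object)
```

Because the two instances are equal but distinct objects, a check written with is would wrongly treat the incoming type as something other than a timedelta.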
Performance Improvements
v1.39.0
1.39.0 (2025-03-05)
Features
- (Preview) Support diff() for date series (#1423) (521e987)
- (Preview) Support aggregations over timedeltas (#1418) (1251ded)
- (Preview) Support arithmetics between dates and timedeltas (#1413) (962b152)
- (Preview) Support automatic load of timedelta from BQ tables. (#1429) (b2917bb)
- Add allow_large_results option to many I/O methods. Set to False to reduce latency (#1428) (dd2f488)
- Add GeoSeries.boundary() (#1435) (32cddfe)
- Add allow_large_results to peek (#1448) (67487b9)
- Add groupby.rank() (#1433) (3a633d5)
- Iloc multiple columns selection. (#1437) (ddfd02a)
- Support interface for BigQuery managed functions (#1373) (2bbf53f)
- Warn if default ingress_settings is used in remote_functions (#1419) (dfd891a)
Bug Fixes
- Do not compare schema description during schema validation (#1452) (03a3a56)
- Remove warnings for null index and partial ordering mode in prep for GA (#1431) (6785aee)
- Warn if default cloud_function_service_account is used in remote_function (#1424) (fe7463a)
- Window operations over JSON columns (#1451) (0070e77)
- Write chunked text instead of dummy text for pdf chunk (#1444) (96b0e8a)
Performance Improvements
Documentation