Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

feat: (Preview) Support arithmetics between dates and timedeltas #1413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Feb 26, 2025

Conversation

sycai
Copy link
Contributor

@sycai sycai commented Feb 20, 2025

Supported operations:

  • date - date
  • timedelta + date
  • date + timedelta
  • date - timedelta

Test cases date series + timedelta series and date series - timedelta series are not supported by Pandas until 2.1.0, but our test setup uses version 1.5.3. However, I cannot increase the pandas version accordingly in test env 3.12 because it would introduce more breaking changes in irrelevant tests.

That being said, operations between a series and a literal are still covered, and we will fill the gap once our Pandas dependency is increased to 2.1.0

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Feb 20, 2025
@sycai sycai changed the title [WIP] Implement date_diff, date_add, and date_sub. feat: (Preview) Support arithmetics between dates and timedeltas Feb 21, 2025
@sycai sycai marked this pull request as ready for review February 25, 2025 20:18
@sycai sycai requested review from a team as code owners February 25, 2025 20:18
@sycai sycai requested a review from ZehaoXU February 25, 2025 20:18
@sycai sycai requested review from tswast, TrevorBergeron and chelsea-lin and removed request for ZehaoXU February 25, 2025 20:18
@@ -740,6 +740,21 @@ def timestamp_sub_op_impl(x: ibis_types.TimestampValue, y: ibis_types.IntegerVal
return x - y.to_interval("us")


@scalar_op_compiler.register_binary_op(ops.date_diff_op)
def date_diff_op_impl(x: ibis_types.DateValue, y: ibis_types.DateValue):
return (x.delta(y, "day") * UNIT_TO_US_CONVERSION_FACTORS["d"]).floor() # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the floor() needed for? don't we already have an integer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call!

UNIT_TO_US_CONVERSION_FACTORS["d"] has float type even though the value itself is an integer, so the multiplication gives us an Ibis.FloatValue. Calling floor() over the value was to cast it into Ibis.IntValue.

That said, floor() does seem unnecessary here. I used int(..) to cast the factor value instead, as it simplifies the generated SQL a bit.


@scalar_op_compiler.register_binary_op(ops.date_add_op)
def date_add_op_impl(x: ibis_types.DateValue, y: ibis_types.IntegerValue):
return x.cast("timestamp") + y.to_interval("us") # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we convert to interval datatype? I would rather just use https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions#date_add with an integer.

Copy link
Contributor Author

@sycai sycai Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the interval conversion because this is how Ibis generates date_add expressions. In other words, Ibis only recognizes DateValue + IntervalValue to get DATE_ADD(Date, Interval ...)


@scalar_op_compiler.register_binary_op(ops.date_sub_op)
def date_sub_op_impl(x: ibis_types.DateValue, y: ibis_types.IntegerValue):
return x.cast("timestamp") - y.to_interval("us") # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, would rather avoid casting to interval type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above.

@sycai sycai enabled auto-merge (squash) February 26, 2025 20:58
@sycai sycai merged commit 962b152 into main Feb 26, 2025
23 checks passed
@sycai sycai deleted the sycai_date_arithmetics branch February 26, 2025 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.