-
Notifications
You must be signed in to change notification settings - Fork 50
refactor: Extract data loading logic into class #913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a large PR. Do we have a bug number or any other contexts? What are we trying to solve here?
I do not have a bug number. Here is a previous commit in this effort: 92fdb93. The goal has been to make Session easier to understand by splitting it's state and functionality into distinct components. Mostly just moving code to new files right now. The hope is by creating some smaller components, it will be easier to optimize each component. Additionally, I do want to try making session thread-safe at some point, and pushing session state into individual objects will help with setting up necessary locks. |
bigframes/session/loader.py
Outdated
scan_index_uniqueness: bool, | ||
metrics: Optional[bigframes.session.metrics.ExecutionMetrics] = None, | ||
): | ||
self.bqclient = bqclient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some members are "private" while others are public. Are they really different or they can be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made them all private
bqclient: bigquery.Client, | ||
storage_manager: bigframes.session.temp_storage.TemporaryGbqStorageManager, | ||
default_index_type: bigframes.enums.DefaultIndexKind, | ||
scan_index_uniqueness: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add docs for the args? At least scan_index_uniqueness and metrics aren't clear at first glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
bigframes/session/executor.py
Outdated
array_value: bigframes.core.ArrayValue, | ||
col_id_overrides: Mapping[str, str], | ||
uri: str, | ||
format: Literal["JSON", "CSV", "PARQUET"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Python usually takes lower cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to lower case
bigframes/session/executor.py
Outdated
col_id_overrides: Mapping[str, str], | ||
uri: str, | ||
format: Literal["JSON", "CSV", "PARQUET"], | ||
export_options: Dict[str, Union[bool, str]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance to make these consistent? Mapping and Dict. Prefer Mapping as a more general type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to use mapping for both
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕