Hotfix/arff#1388

Merged
PGijsbers merged 3 commits into develop (openml/openml-python:develop) from hotfix/arff (openml/openml-python:hotfix/arff)
Jan 25, 2025
@PGijsbers (Collaborator) commented Jan 25, 2025

This PR introduces a backdoor for disabling parquet downloads through the OPENML_SKIP_PQ environment variable.
If OPENML_SKIP_PQ is set to true (case insensitive), the get_dataset and OpenMLDataset.get_data calls will not attempt to download parquet files.
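
As a rough illustration of the check described above, the sketch below reads the variable and compares it case-insensitively to "true". The helper name skip_parquet and its environ parameter are hypothetical and only there to make the example testable; this is not necessarily how the PR implements it.

```python
import os


def skip_parquet(environ=None):
    """Return True when OPENML_SKIP_PQ is set to "true" (any casing).

    Hypothetical helper for illustration only; the actual check in
    openml-python may be written differently.
    """
    env = os.environ if environ is None else environ
    # An unset variable falls back to "", which never equals "true".
    return env.get("OPENML_SKIP_PQ", "").lower() == "true"
```

Under this sketch, values such as "True" or "TRUE" enable the skip, while an unset variable or any other value leaves parquet downloads enabled.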

The PR also fixes a bug where an error would be raised if the parquet file failed to download. _get_dataset_parquet_file(self) returns None when the download fails, but the result was wrapped in str(...), which turns None into the string "None". The check that follows was therefore always false, so the code never fell back to arff.
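
The failure mode can be reproduced in isolation. The sketch below is a simplified stand-in, not the code from openml/datasets/dataset.py: fetch_parquet_stub, resolve_buggy, and resolve_fixed are hypothetical names, and the check is assumed to be a comparison against None.

```python
from typing import Optional


def fetch_parquet_stub() -> Optional[str]:
    """Hypothetical stand-in for a parquet download that failed."""
    return None


def resolve_buggy() -> str:
    # str(None) is the non-empty string "None", so the check below never fires.
    parquet_file = str(fetch_parquet_stub())
    if parquet_file is None:
        return "arff"
    return parquet_file


def resolve_fixed() -> str:
    # Check for None first, and convert to str only when a path was returned.
    parquet_file = fetch_parquet_stub()
    if parquet_file is None:
        return "arff"
    return str(parquet_file)
```

With the stub returning None, resolve_buggy yields the string "None" instead of falling back, whereas resolve_fixed falls back to arff.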

The reason for using an environment variable:

  • It's a bit quicker to implement than a configuration option, and it makes it easier to turn it on for a single call.
  • Compared to adding function arguments, an environment variable (or configuration option) doesn't require changes to existing scripts. That makes it easier for people to start using it now and to stop using it later. We can issue a warning if the environment variable remains set in a later release.
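
One way to enable the variable for a single call, as mentioned in the first point, is a small context manager that sets it and then restores the previous state. This is a sketch under assumptions: parquet_skipped is a hypothetical helper, not part of openml-python, and it only has an effect if the library reads OPENML_SKIP_PQ at call time.

```python
import os
from contextlib import contextmanager


@contextmanager
def parquet_skipped():
    """Temporarily set OPENML_SKIP_PQ=true, restoring the old value on exit.

    Hypothetical helper for illustration; not provided by openml-python.
    """
    previous = os.environ.get("OPENML_SKIP_PQ")
    os.environ["OPENML_SKIP_PQ"] = "true"
    try:
        yield
    finally:
        if previous is None:
            # The variable was unset before, so remove it again.
            os.environ.pop("OPENML_SKIP_PQ", None)
        else:
            os.environ["OPENML_SKIP_PQ"] = previous
```

A call such as get_dataset would then go inside a `with parquet_skipped():` block, leaving the rest of the script unaffected.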

Review comment on openml/datasets/dataset.py (outdated, resolved)
Review comment on openml/datasets/functions.py (outdated, resolved)

@LennartPurucker (Contributor) left a comment

LGTM with comments from Jos

@PGijsbers PGijsbers merged commit cc28b1d into develop Jan 25, 2025
1 of 12 checks passed
@PGijsbers PGijsbers deleted the hotfix/arff branch January 25, 2025 10:38