ENH fetch_file to fetch data files by URL with retries, checksumming and local caching #29354

Merged on Jul 11, 2024 (31 commits)

Commits
6119be4
ENH fetch_file to fetch data files by URL with retries, checksuming a…
ogrisel Jun 26, 2024
a4c456d
TST add tests for fetch_file, with and without SHA256 checks
ogrisel Jun 27, 2024
d8bd174
Improve docstring
ogrisel Jun 27, 2024
ff808d3
Test fetch_file's use of get_data_home
ogrisel Jun 27, 2024
8facede
Merge branch 'main' into fetch_file
ogrisel Jun 27, 2024
219a077
Add changelog entry
ogrisel Jun 27, 2024
b969c85
Fix PR number in changelog entry...
ogrisel Jun 27, 2024
b6900a1
Close the temp file earlier to make Windows happier?
ogrisel Jun 27, 2024
77ce36e
Update example on feature engineering with Polars for bike sharing de…
ogrisel Jun 27, 2024
c7a35e0
Make expected warning message OS independent
ogrisel Jun 27, 2024
c09ddff
Shorter warning message
ogrisel Jun 27, 2024
8368fe7
Improve phrasing in the first example cell
ogrisel Jun 27, 2024
5aa07b7
Simplify _slugify
ogrisel Jun 27, 2024
5da0feb
Apply suggestions from code review
ogrisel Jun 27, 2024
b9df5b2
Add URL to openml UI for the Bike Sharing Dataset
ogrisel Jun 27, 2024
4e50efa
Explain the use of the sha256 argument
ogrisel Jun 27, 2024
fa429e1
Trim useless empty cell.
ogrisel Jun 27, 2024
811dc3c
Trailing line
ogrisel Jun 27, 2024
46756de
Merge branch 'main' into fetch_file
ogrisel Jun 28, 2024
91812b5
Update the docstring of _slugify to better describe the actual behavior
ogrisel Jul 5, 2024
f465998
Improve _derive_folder_and_filename_from_url safety based on feedback…
ogrisel Jul 5, 2024
8c2120a
Simpler handling of repetition + fix bug in replacing white spaces as…
ogrisel Jul 5, 2024
f14235d
Empty commit to trigger PR update
ogrisel Jul 5, 2024
4854eba
Test .. explicitly and more stripping patterns
ogrisel Jul 5, 2024
c78019f
Better test what Adrin actually suggested
ogrisel Jul 5, 2024
f55da8c
Better corrupted contents.
ogrisel Jul 9, 2024
e6e35a1
Mention slugify in the docstring.
ogrisel Jul 9, 2024
94e5563
Explain the logic behind manual deletion and renaming of the temporary…
ogrisel Jul 10, 2024
82211b3
Also clean the tempfile in case of ctrl-c
ogrisel Jul 10, 2024
bed8515
Clarify the relationship between delete=False and temp_file.close()
ogrisel Jul 11, 2024
4fc5c10
lint
lesteve Jul 11, 2024
doc/whats_new/v1.6.rst (9 changes: 9 additions & 0 deletions)
@@ -114,6 +114,15 @@ Changelog
  on the input data.
  :pr:`29124` by :user:`Yao Xiao <Charlie-XIAO>`.


:mod:`sklearn.datasets`
.......................

- |Feature| :func:`datasets.fetch_file` allows downloading arbitrary data-file
  from the web. It handles local caching, integrity checks with SHA256 digests
  and automatic retries in case of HTTP errors. :pr:`29354` by :user:`Olivier
  Grisel <ogrisel>`.

:mod:`sklearn.discriminant_analysis`
....................................
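
The changelog entry above names three behaviors: local caching, SHA256 integrity checks, and automatic retries. The following is a minimal standard-library sketch of how those three pieces can fit together. It is not scikit-learn's implementation: `cached_download`, `sha256_of`, and the parameters `cache_dir`, `filename`, `n_retries`, and `delay` are hypothetical names chosen for illustration. Only the idea of a `sha256` argument comes from the commit history of this PR. Writing to a `.part` file before renaming is also an assumption of this sketch.

```python
import hashlib
import time
import urllib.error
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_download(url, cache_dir, filename, sha256=None, n_retries=3, delay=1.0):
    """Hypothetical helper: fetch `url` into `cache_dir` unless already cached."""
    target = Path(cache_dir) / filename

    # Local caching: reuse an existing file if it passes the integrity check.
    if target.exists():
        if sha256 is None or sha256_of(target) == sha256:
            return target
        target.unlink()  # cached copy does not match; download again

    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name(target.name + ".part")

    # Automatic retries: try once, then up to `n_retries` more times.
    for attempt in range(n_retries + 1):
        try:
            with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
                while chunk := response.read(8192):
                    out.write(chunk)
            break
        except urllib.error.URLError:  # HTTPError is a subclass of URLError
            tmp.unlink(missing_ok=True)
            if attempt == n_retries:
                raise
            time.sleep(delay)

    # Integrity check: refuse to cache a file whose digest does not match.
    if sha256 is not None and sha256_of(tmp) != sha256:
        tmp.unlink()
        raise OSError(f"SHA256 mismatch for {url}")

    # Rename only after a complete, verified download.
    tmp.replace(target)
    return target
```

A call such as `cached_download(url, "data_cache", "bikes.csv", sha256=expected_digest)` (all argument values illustrative) would download once and return the cached path on later calls. Renaming the temporary file only after verification means an interrupted or corrupted download never appears under the final filename.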
