# %% [markdown]
# How to list, download and upload benchmark suites.
# %%
import uuid
import numpy as np
import openml
# %% [markdown]
# ## Listing suites
#
# * The listing is returned as a pandas ``DataFrame``, so ``head`` can be
#   used to show the first rows
# * ``status="all"`` requests suites regardless of their status
# %%
suites = openml.study.list_suites(status="all")
print(suites.head(n=10))
# %% [markdown]
# ## Downloading suites
# This is done based on the suite ID.
# %%
suite = openml.study.get_suite(99)
print(suite)
# %% [markdown]
# Suites also feature a description:
# %%
print(suite.description)
# %% [markdown]
# Suites are a container for tasks:
# %%
print(suite.tasks)
# %% [markdown]
# And we can use the task listing functionality to learn more about them:
# %%
tasks = openml.tasks.list_tasks()
# %% [markdown]
# Using ``@`` in
# [pd.DataFrame.query](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html)
# accesses variables outside of the current dataframe.
# %%
tasks = tasks.query("tid in @suite.tasks")
print(tasks.describe().transpose())
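# %% [markdown]
# To see the ``@`` lookup in isolation, here is a minimal offline sketch. The
# frame ``toy_tasks`` and the list ``wanted_tids`` are made up for illustration
# and only mimic the ``tid`` column used above; they are not OpenML data.
```python
# %%
import pandas as pd

# Hypothetical stand-in for the task listing: only the ``tid`` column matters here.
toy_tasks = pd.DataFrame(
    {
        "tid": [3, 6, 11, 12, 14],
        "name": ["a", "b", "c", "d", "e"],
    }
)

# Hypothetical stand-in for ``suite.tasks``.
wanted_tids = [6, 12, 99]

# ``@wanted_tids`` refers to the Python variable, not to a column of the frame.
selected = toy_tasks.query("tid in @wanted_tids")

# The same selection written with a boolean mask, for comparison.
selected_mask = toy_tasks[toy_tasks["tid"].isin(wanted_tids)]

print(selected)
```
# %% [markdown]
# IDs that do not occur in the frame (``99`` in this sketch) are simply
# ignored by both variants.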
# %% [markdown]
# We'll use the test server for the rest of this tutorial.
# %%
openml.config.start_using_configuration_for_example()
# %% [markdown]
# ## Uploading suites
#
# Uploading a suite is as simple as uploading any other kind of OpenML
# entity. This example only needs so much code because it builds the suite
# from randomly chosen tasks.
# We'll take a random subset of twenty of the tasks available on
# the test server:
# %%
all_tasks = list(openml.tasks.list_tasks()["tid"])
task_ids_for_suite = sorted(np.random.choice(all_tasks, replace=False, size=20))
# The suite needs a unique, machine-readable alias. To obtain one,
# we simply generate a random UUID.
alias = uuid.uuid4().hex
new_suite = openml.study.create_benchmark_suite(
    name="Test-Suite",
    description="Test suite for the Python tutorial on benchmark suites",
    task_ids=task_ids_for_suite,
    alias=alias,
)
new_suite.publish()
print(new_suite)
# %%
openml.config.stop_using_configuration_for_example()
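# %% [markdown]
# The sampling and alias steps of the upload example can also be tried
# without any server access. The sketch below is an assumption-laden variant,
# not part of the tutorial's workflow: ``candidate_ids`` is a made-up pool of
# task IDs, and a seeded ``np.random.default_rng`` replaces the unseeded
# ``np.random.choice`` call so that the draw is repeatable.
```python
# %%
import uuid

import numpy as np

# Hypothetical pool of task IDs; in the tutorial this comes from the test server.
candidate_ids = list(range(1, 101))

# A seeded generator makes the draw repeatable; the seed value is arbitrary.
rng = np.random.default_rng(seed=0)

# replace=False guarantees that no task ID is drawn twice.
drawn = rng.choice(candidate_ids, size=20, replace=False)

# Convert NumPy integers to plain Python ints and sort them.
sample_ids = sorted(int(tid) for tid in drawn)

# uuid4().hex is a 32-character lowercase hexadecimal string.
sample_alias = uuid.uuid4().hex

print(sample_ids)
print(sample_alias)
```
# %% [markdown]
# Drawing without replacement requires a pool of at least ``size`` IDs;
# with fewer candidates the call raises a ``ValueError``.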