Commit 03fe036

Merge branch 'for0.20.4' into 0.20.X
2 parents 7b136e9 + c0bd85f commit 03fe036

33 files changed

+429
-177
lines changed
‎.circleci/config.yml

5 additions & 5 deletions
@@ -42,11 +42,11 @@ jobs:
4242
- MINICONDA_PATH: ~/miniconda
4343
- CONDA_ENV_NAME: testenv
4444
- PYTHON_VERSION: "2"
45-
- NUMPY_VERSION: "1.10"
46-
- SCIPY_VERSION: "0.16"
47-
- MATPLOTLIB_VERSION: "1.4"
48-
- SCIKIT_IMAGE_VERSION: "0.11"
49-
- PANDAS_VERSION: "0.17.1"
45+
- NUMPY_VERSION: "1.*"
46+
- SCIPY_VERSION: "0.*"
47+
- MATPLOTLIB_VERSION: "*"
48+
- SCIKIT_IMAGE_VERSION: "0.*"
49+
- PANDAS_VERSION: "0.*"
5050
steps:
5151
- checkout
5252
- run: ./build_tools/circle/checkout_merge_commit.sh

‎.travis.yml

3 additions & 3 deletions
@@ -35,12 +35,12 @@ matrix:
3535
- libatlas-dev
3636
# Python 3.4 build
3737
- env: DISTRIB="conda" PYTHON_VERSION="3.4" INSTALL_MKL="false"
38-
NUMPY_VERSION="1.10.4" SCIPY_VERSION="0.16.1" CYTHON_VERSION="0.25.2"
39-
PILLOW_VERSION="4.0.0" COVERAGE=true
38+
NUMPY_VERSION="1.10.4" SCIPY_VERSION="0.17" CYTHON_VERSION="0.25.2"
39+
PILLOW_VERSION="4.0.0" COVERAGE=
4040
if: type != cron
4141
# Python 3.5 build
4242
- env: DISTRIB="conda" PYTHON_VERSION="3.5" INSTALL_MKL="false"
43-
NUMPY_VERSION="1.10.4" SCIPY_VERSION="0.16.1" CYTHON_VERSION="0.25.2"
43+
NUMPY_VERSION="1.10.4" SCIPY_VERSION="0.17" CYTHON_VERSION="0.25.2"
4444
PILLOW_VERSION="4.0.0" COVERAGE=true
4545
SKLEARN_SITE_JOBLIB=1 JOBLIB_VERSION="0.11"
4646
if: type != cron

‎build_tools/circle/build_doc.sh

1 addition & 1 deletion
@@ -119,7 +119,7 @@ conda update --yes --quiet conda
119119
# provided versions
120120
conda create -n $CONDA_ENV_NAME --yes --quiet python="${PYTHON_VERSION:-*}" \
121121
numpy="${NUMPY_VERSION:-*}" scipy="${SCIPY_VERSION:-*}" cython \
122-
pytest coverage matplotlib="${MATPLOTLIB_VERSION:-*}" sphinx=1.6.2 pillow \
122+
pytest coverage matplotlib="${MATPLOTLIB_VERSION:-*}" sphinx=1.6.* pillow \
123123
scikit-image="${SCIKIT_IMAGE_VERSION:-*}" pandas="${PANDAS_VERSION:-*}" \
124124
joblib
125125

‎build_tools/travis/install.sh

11 additions & 5 deletions
@@ -11,7 +11,7 @@
1111
# matrix entry) from which we pull from local Travis repository. This allows
1212
# us to keep build artefact for gcc + cython, and gain time
1313

14-
set -e
14+
set -ex
1515

1616
echo 'List files from cached directories'
1717
echo 'pip:'
@@ -38,12 +38,18 @@ make_conda() {
3838
export PATH=$MINICONDA_PATH/bin:$PATH
3939
conda update --yes conda
4040

41-
conda create -n testenv --yes $TO_INSTALL
41+
conda create -c conda-forge -n testenv --yes $TO_INSTALL
4242
source activate testenv
4343
}
4444

45+
if [[ "$COVERAGE" == "true" ]]; then
46+
TEST_DEPS="pytest pytest-cov"
47+
else
48+
TEST_DEPS="pytest"
49+
fi
50+
4551
if [[ "$DISTRIB" == "conda" ]]; then
46-
TO_INSTALL="python=$PYTHON_VERSION pip pytest pytest-cov \
52+
TO_INSTALL="python=$PYTHON_VERSION pip $TEST_DEPS \
4753
numpy=$NUMPY_VERSION scipy=$SCIPY_VERSION \
4854
cython=$CYTHON_VERSION"
4955

@@ -84,7 +90,7 @@ elif [[ "$DISTRIB" == "ubuntu" ]]; then
8490
# and scipy
8591
virtualenv --system-site-packages testvenv
8692
source testvenv/bin/activate
87-
pip install pytest pytest-cov cython==$CYTHON_VERSION
93+
pip install $TEST_DEPS cython==$CYTHON_VERSION
8894

8995
elif [[ "$DISTRIB" == "scipy-dev" ]]; then
9096
make_conda python=3.7
@@ -96,7 +102,7 @@ elif [[ "$DISTRIB" == "scipy-dev" ]]; then
96102
echo "Installing joblib master"
97103
pip install https://github.com/joblib/joblib/archive/master.zip
98104
export SKLEARN_SITE_JOBLIB=1
99-
pip install pytest pytest-cov
105+
pip install $TEST_DEPS
100106
fi
101107

102108
if [[ "$COVERAGE" == "true" ]]; then

‎build_tools/travis/test_pytest_soft_dependency.sh

1 addition & 0 deletions
@@ -7,6 +7,7 @@ if [[ "$CHECK_PYTEST_SOFT_DEPENDENCY" == "true" ]]; then
77
if [[ "$COVERAGE" == "true" ]]; then
88
# Need to append the coverage to the existing .coverage generated by
99
# running the tests
10+
pip install coverage
1011
CMD="coverage run --append"
1112
else
1213
CMD="python"

‎doc/index.rst

5 additions & 5 deletions
@@ -207,20 +207,20 @@
207207
<li><em>On-going development:</em>
208208
<a href="/dev/whats_new.html"><em>What's new</em> (Changelog)</a>
209209
</li>
210-
<li><strong>Scikit-learn 0.21 will drop support for Python 2.7 and Python 3.4.</strong>
210+
<li><strong>Scikit-learn from 0.21 requires Python 3.5 or greater.</strong>
211211
</li>
212-
<li><em>March 2019.</em> scikit-learn 0.20.3 is available for download (<a href="whats_new.html#version-0-20-3">Changelog</a>).
212+
<li><em>July 2019.</em> scikit-learn 0.21.3 (<a href="whats_new.html#version-0-21-3">Changelog</a>) and 0.20.4 (<a href="whats_new.html#version-0-20-4">Changelog</a>) are available for download.
213+
</li>
214+
<li><em>May 2019.</em> scikit-learn 0.21.0 to 0.21.2 are available for download (<a href="whats_new.html#version-0-21">Changelog</a>).
213215
</li>
214-
<li><em>December 2018.</em> scikit-learn 0.20.2 is available for download (<a href="whats_new.html#version-0-20-2">Changelog</a>)
216+
<li><em>March 2019.</em> scikit-learn 0.20.3 is available for download (<a href="whats_new.html#version-0-20-3">Changelog</a>).
215217
</li>
216218
<li><em>September 2018.</em> scikit-learn 0.20.0 is available for download (<a href="whats_new.html#version-0-20-0">Changelog</a>).
217219
</li>
218220
<li><em>July 2018.</em> scikit-learn 0.19.2 is available for download (<a href="whats_new.html#version-0-19">Changelog</a>).
219221
</li>
220222
<li><em>July 2017.</em> scikit-learn 0.19.0 is available for download (<a href="whats_new/v0.19.html#version-0-19">Changelog</a>).
221223
</li>
222-
<li><em>June 2017.</em> scikit-learn 0.18.2 is available for download (<a href="whats_new/v0.18.html#version-0-18-2">Changelog</a>).
223-
</li>
224224
</ul>
225225
</div>
226226

‎doc/whats_new/v0.20.rst

45 additions & 1 deletion
@@ -2,6 +2,50 @@
22

33
.. currentmodule:: sklearn
44

5+
.. _changes_0_20_4:
6+
7+
Version 0.20.4
8+
==============
9+
10+
**July 30, 2019**
11+
12+
This is a bug-fix release with some bug fixes applied to version 0.20.3.
13+
14+
Changelog
15+
---------
16+
17+
The bundled version of joblib was upgraded from 0.13.0 to 0.13.2.
18+
19+
:mod:`sklearn.cluster`
20+
..............................
21+
22+
- |Fix| Fixed a bug in :class:`cluster.KMeans` where KMeans++ initialisation
23+
could rarely result in an IndexError. :issue:`11756` by `Joel Nothman`_.
24+
25+
:mod:`sklearn.compose`
26+
.....................
27+
28+
- |Fix| Fixed an issue in :class:`compose.ColumnTransformer` where using
29+
DataFrames whose column order differs between :func:``fit`` and
30+
:func:``transform`` could lead to silently passing incorrect columns to the
31+
``remainder`` transformer.
32+
:pr:`14237` by `Andreas Schuderer <schuderer>`.
33+
34+
:mod:`sklearn.model_selection`
35+
..............................
36+
37+
- |Fix| Fixed a bug where :class:`model_selection.StratifiedKFold`
38+
shuffles each class's samples with the same ``random_state``,
39+
making ``shuffle=True`` ineffective.
40+
:issue:`13124` by :user:`Hanmin Qin <qinhanmin2014>`.
41+
42+
:mod:`sklearn.neighbors`
43+
......................
44+
45+
- |Fix| Fixed a bug in :class:`neighbors.KernelDensity` which could not be
46+
restored from a pickle if ``sample_weight`` had been used.
47+
:issue:`13772` by :user:`Aditya Vyas <aditya1702>`.
48+
549
.. _changes_0_20_3:
650

751
Version 0.20.3
@@ -30,7 +74,7 @@ Changelog
3074
:issue:`12946` by :user:`Pierre Tallotte <pierretallotte>`.
3175

3276
:mod:`sklearn.covariance`
33-
......................
77+
.........................
3478

3579
- |Fix| Fixed a regression in :func:`covariance.graphical_lasso` so that
3680
the case `n_features=2` is handled correctly. :issue:`13276` by

‎sklearn/__init__.py

1 addition & 1 deletion
@@ -44,7 +44,7 @@
4444
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
4545
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'
4646
#
47-
__version__ = '0.20.3'
47+
__version__ = '0.20.4'
4848

4949

5050
try:

‎sklearn/cluster/dbscan_.py

3 additions & 2 deletions
@@ -10,11 +10,11 @@
1010
# License: BSD 3 clause
1111

1212
import numpy as np
13+
import warnings
1314
from scipy import sparse
1415

1516
from ..base import BaseEstimator, ClusterMixin
1617
from ..utils import check_array, check_consistent_length
17-
from ..utils.testing import ignore_warnings
1818
from ..neighbors import NearestNeighbors
1919

2020
from ._dbscan_inner import dbscan_inner
@@ -139,7 +139,8 @@ def dbscan(X, eps=0.5, min_samples=5, metric='minkowski', metric_params=None,
139139
X.sum_duplicates() # XXX: modifies X's internals in-place
140140

141141
# set the diagonal to explicit values, as a point is its own neighbor
142-
with ignore_warnings():
142+
with warnings.catch_warnings():
143+
warnings.simplefilter('ignore', sparse.SparseEfficiencyWarning)
143144
X.setdiag(X.diagonal()) # XXX: modifies X's internals in-place
144145

145146
X_mask = X.data <= eps
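
Note on the hunk above: the blanket `ignore_warnings()` helper from `sklearn.utils.testing` is replaced by a standard-library filter that silences only one warning category. The following is a minimal sketch of that pattern, not the scikit-learn code itself. `EfficiencyWarning` and `noisy_operation` are hypothetical stand-ins (for `scipy.sparse.SparseEfficiencyWarning` and `X.setdiag(...)`), used so the example runs with the standard library alone.

```python
import warnings


class EfficiencyWarning(UserWarning):
    """Hypothetical stand-in for scipy.sparse.SparseEfficiencyWarning."""


def noisy_operation():
    # Emits one warning we want to hide and one we want to keep visible.
    warnings.warn("changing the sparsity structure is expensive",
                  EfficiencyWarning)
    warnings.warn("something unrelated happened", RuntimeWarning)
    return 42


# record=True collects the warnings that pass the filters, for inspection.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")                     # report everything...
    warnings.simplefilter("ignore", EfficiencyWarning)  # ...except this category
    result = noisy_operation()

# Only the unrelated warning survives the targeted filter.
categories = [w.category for w in caught]
```

Filtering by category keeps unrelated warnings visible, which a blanket ignore would hide. The filters are restored when the `with` block exits.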

‎sklearn/cluster/k_means_.py

3 additions & 0 deletions
@@ -111,6 +111,9 @@ def _k_init(X, n_clusters, x_squared_norms, random_state, n_local_trials=None):
111111
rand_vals = random_state.random_sample(n_local_trials) * current_pot
112112
candidate_ids = np.searchsorted(stable_cumsum(closest_dist_sq),
113113
rand_vals)
114+
# XXX: numerical imprecision can result in a candidate_id out of range
115+
np.clip(candidate_ids, None, closest_dist_sq.size - 1,
116+
out=candidate_ids)
114117

115118
# Compute distances to center candidates
116119
distance_to_candidates = euclidean_distances(
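
Note on the hunk above: `np.searchsorted` returns an index equal to the array length when a query value exceeds every entry, and that index is not valid for indexing. The sketch below reproduces the situation with hypothetical numbers (the `cumulative` and `rand_vals` arrays are made up for illustration, and the oversized draw is forced with `np.nextafter` rather than arising from real rounding error), then applies the same in-place `np.clip` guard as the diff.

```python
import numpy as np

# Hypothetical cumulative squared distances; the last entry is the total.
cumulative = np.array([1.0, 3.0, 6.0])

# The third draw sits just above the final cumulative value, mimicking a
# rounding mismatch between the total and the cumulative sum.
rand_vals = np.array([0.5, 2.5, np.nextafter(6.0, np.inf)])

candidate_ids = np.searchsorted(cumulative, rand_vals)
out_of_range = bool(candidate_ids.max() >= cumulative.size)

# Same guard as in the diff: cap indices at the last valid position, in place.
np.clip(candidate_ids, None, cumulative.size - 1, out=candidate_ids)
```

After clipping, every index can safely be used to select a row, at the cost of a negligible bias toward the last sample in the rare out-of-range case.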

‎sklearn/compose/_column_transformer.py

31 additions & 4 deletions
@@ -83,7 +83,9 @@ class ColumnTransformer(_BaseComposition, TransformerMixin):
8383
the transformers.
8484
By setting ``remainder`` to be an estimator, the remaining
8585
non-specified columns will use the ``remainder`` estimator. The
86-
estimator must support `fit` and `transform`.
86+
estimator must support :term:`fit` and :term:`transform`.
87+
Note that using this feature requires that the DataFrame columns
88+
input at :term:`fit` and :term:`transform` have identical order.
8789
8890
sparse_threshold : float, default = 0.3
8991
If the output of the different transfromers contains sparse matrices,
@@ -295,11 +297,17 @@ def _validate_remainder(self, X):
295297
"'passthrough', or estimator. '%s' was passed instead" %
296298
self.remainder)
297299

298-
n_columns = X.shape[1]
300+
# Make it possible to check for reordered named columns on transform
301+
if (hasattr(X, 'columns') and
302+
any(_check_key_type(cols, str) for cols in self._columns)):
303+
self._df_columns = X.columns
304+
305+
self._n_features = X.shape[1]
299306
cols = []
300307
for columns in self._columns:
301308
cols.extend(_get_column_indices(X, columns))
302-
remaining_idx = sorted(list(set(range(n_columns)) - set(cols))) or None
309+
remaining_idx = list(set(range(self._n_features)) - set(cols))
310+
remaining_idx = sorted(remaining_idx) or None
303311

304312
self._remainder = ('remainder', self.remainder, remaining_idx)
305313

@@ -488,8 +496,27 @@ def transform(self, X):
488496
489497
"""
490498
check_is_fitted(self, 'transformers_')
491-
492499
X = _check_X(X)
500+
501+
if self._n_features > X.shape[1]:
502+
raise ValueError('Number of features of the input must be equal '
503+
'to or greater than that of the fitted '
504+
'transformer. Transformer n_features is {0} '
505+
'and input n_features is {1}.'
506+
.format(self._n_features, X.shape[1]))
507+
508+
# No column reordering allowed for named cols combined with remainder
509+
if (self._remainder[2] is not None and
510+
hasattr(self, '_df_columns') and
511+
hasattr(X, 'columns')):
512+
n_cols_fit = len(self._df_columns)
513+
n_cols_transform = len(X.columns)
514+
if (n_cols_transform >= n_cols_fit and
515+
any(X.columns[:n_cols_fit] != self._df_columns)):
516+
raise ValueError('Column ordering must be equal for fit '
517+
'and for transform when using the '
518+
'remainder keyword')
519+
493520
Xs = self._fit_transform(X, None, _transform_one, fitted=True)
494521
self._validate_output(Xs)
495522

‎sklearn/compose/tests/test_column_transformer.py

48 additions & 0 deletions
@@ -498,6 +498,17 @@ def test_column_transformer_invalid_columns(remainder):
498498
assert_raise_message(ValueError, "Specifying the columns",
499499
ct.fit, X_array)
500500

501+
# transformed n_features does not match fitted n_features
502+
col = [0, 1]
503+
ct = ColumnTransformer([('trans', Trans(), col)], remainder=remainder)
504+
ct.fit(X_array)
505+
X_array_more = np.array([[0, 1, 2], [2, 4, 6], [3, 6, 9]]).T
506+
ct.transform(X_array_more) # Should accept added columns
507+
X_array_fewer = np.array([[0, 1, 2], ]).T
508+
err_msg = 'Number of features'
509+
with pytest.raises(ValueError, match=err_msg):
510+
ct.transform(X_array_fewer)
511+
501512

502513
def test_column_transformer_invalid_transformer():
503514

@@ -1033,3 +1044,40 @@ def test_column_transformer_negative_column_indexes():
10331044
tf_1 = ColumnTransformer([('ohe', ohe, [-1])], remainder='passthrough')
10341045
tf_2 = ColumnTransformer([('ohe', ohe, [2])], remainder='passthrough')
10351046
assert_array_equal(tf_1.fit_transform(X), tf_2.fit_transform(X))
1047+
1048+
1049+
@pytest.mark.parametrize("explicit_colname", ['first', 'second'])
1050+
def test_column_transformer_reordered_column_names_remainder(explicit_colname):
1051+
"""Regression test for issue #14223: 'Named col indexing fails with
1052+
ColumnTransformer remainder on changing DataFrame column ordering'
1053+
1054+
Should raise error on changed order combined with remainder.
1055+
Should allow for added columns in `transform` input DataFrame
1056+
as long as all preceding columns match.
1057+
"""
1058+
pd = pytest.importorskip('pandas')
1059+
1060+
X_fit_array = np.array([[0, 1, 2], [2, 4, 6]]).T
1061+
X_fit_df = pd.DataFrame(X_fit_array, columns=['first', 'second'])
1062+
1063+
X_trans_array = np.array([[2, 4, 6], [0, 1, 2]]).T
1064+
X_trans_df = pd.DataFrame(X_trans_array, columns=['second', 'first'])
1065+
1066+
tf = ColumnTransformer([('bycol', Trans(), explicit_colname)],
1067+
remainder=Trans())
1068+
1069+
tf.fit(X_fit_df)
1070+
err_msg = 'Column ordering must be equal'
1071+
with pytest.raises(ValueError, match=err_msg):
1072+
tf.transform(X_trans_df)
1073+
1074+
# No error for added columns if ordering is identical
1075+
X_extended_df = X_fit_df.copy()
1076+
X_extended_df['third'] = [3, 6, 9]
1077+
tf.transform(X_extended_df) # No error should be raised
1078+
1079+
# No 'columns' AttributeError when transform input is a numpy array
1080+
X_array = X_fit_array.copy()
1081+
err_msg = 'Specifying the columns'
1082+
with pytest.raises(ValueError, match=err_msg):
1083+
tf.transform(X_array)

‎sklearn/cross_decomposition/tests/test_pls.py

4 additions & 4 deletions
@@ -357,13 +357,13 @@ def test_scale_and_stability():
357357
X_score, Y_score = clf.fit_transform(X, Y)
358358
clf.set_params(scale=False)
359359
X_s_score, Y_s_score = clf.fit_transform(X_s, Y_s)
360-
assert_array_almost_equal(X_s_score, X_score)
361-
assert_array_almost_equal(Y_s_score, Y_score)
360+
assert_array_almost_equal(X_s_score, X_score, decimal=4)
361+
assert_array_almost_equal(Y_s_score, Y_score, decimal=4)
362362
# Scaling should be idempotent
363363
clf.set_params(scale=True)
364364
X_score, Y_score = clf.fit_transform(X_s, Y_s)
365-
assert_array_almost_equal(X_s_score, X_score)
366-
assert_array_almost_equal(Y_s_score, Y_score)
365+
assert_array_almost_equal(X_s_score, X_score, decimal=4)
366+
assert_array_almost_equal(Y_s_score, Y_score, decimal=4)
367367

368368

369369
def test_pls_errors():
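
Note on the hunk above: `assert_array_almost_equal` defaults to `decimal=6`, so passing `decimal=4` loosens the tolerance by two orders of magnitude. The sketch below shows the effect on a hypothetical drift of `2e-5`, a value chosen for illustration and not taken from the PLS test.

```python
import numpy as np
from numpy.testing import assert_array_almost_equal

a = np.array([1.0, 2.0])
b = a + 2e-5  # hypothetical small numerical drift between platforms

# With the default decimal=6 the drift is too large and the check fails.
try:
    assert_array_almost_equal(a, b)
    passes_default = True
except AssertionError:
    passes_default = False

# With decimal=4 the same arrays are accepted (no exception is raised).
assert_array_almost_equal(a, b, decimal=4)
passes_decimal_4 = True
```

The commit does not say why the tolerance was relaxed. Small platform-dependent or dependency-version-dependent differences are one plausible reason, but that is an assumption.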

‎sklearn/datasets/svmlight_format.py

2 additions & 2 deletions
@@ -134,8 +134,8 @@ def load_svmlight_file(f, n_features=None, dtype=np.float64,
134134
135135
See also
136136
--------
137-
load_svmlight_files: similar function for loading multiple files in this
138-
format, enforcing the same number of features/columns on all of them.
137+
load_svmlight_files : similar function for loading multiple files in this
138+
format, enforcing the same number of features/columns on all of them.
139139
140140
Examples
141141
--------

‎sklearn/externals/joblib/__init__.py

2 additions & 2 deletions
@@ -14,7 +14,7 @@
1414
==================== ===============================================
1515
**Documentation:** https://joblib.readthedocs.io
1616
17-
**Download:** http://pypi.python.org/pypi/joblib#downloads
17+
**Download:** https://pypi.python.org/pypi/joblib#downloads
1818
1919
**Source code:** https://github.com/joblib/joblib
2020
@@ -106,7 +106,7 @@
106106
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
107107
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'
108108
#
109-
__version__ = '0.13.0'
109+
__version__ = '0.13.2'
110110

111111

112112
from .memory import Memory, MemorizedResult, register_store_backend
