Segfault when passing Criterion object to Forest ensembles with n_jobs>1 #12623

Closed
@wyegelwel

Description

When passing a Criterion object to RandomForest or ExtraTrees instead of a criterion string, I've observed segfaults during fitting when n_jobs > 1. In my case I've written a custom Criterion, but the problem also reproduces with one of sklearn's built-in criteria if you pass the Criterion object instead of the string.

I believe the problem is that, when the list of estimators for the ensemble is created, the parameters aren't copied, so the same Criterion object is used by all the trees. When n_jobs=1 this is OK because the criterion is re-initialized at each split. When n_jobs>1, however, the same criterion is modified by multiple threads, which results in cases where pointers are freed and then accessed.
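To illustrate the sharing described above, here is a minimal sketch using only the standard library. `ToyCriterion`, `ToyTree`, and `ToyForest` are hypothetical stand-ins, not sklearn classes; the sketch only assumes that the ensemble forwards its parameters to each tree via `getattr` without copying them.

```python
class ToyCriterion:
    """Hypothetical stand-in for a mutable Criterion object."""

    def __init__(self):
        self.state = None


class ToyTree:
    """Hypothetical stand-in for a single tree estimator."""

    def __init__(self, criterion=None):
        self.criterion = criterion


class ToyForest:
    """Hypothetical ensemble that forwards parameters by reference."""

    def __init__(self, criterion, n_estimators):
        self.criterion = criterion
        self.n_estimators = n_estimators
        self.estimator_params = ("criterion",)

    def make_estimators(self):
        # getattr returns a reference, so no copy is made here.
        params = {p: getattr(self, p) for p in self.estimator_params}
        return [ToyTree(**params) for _ in range(self.n_estimators)]


forest = ToyForest(ToyCriterion(), n_estimators=3)
trees = forest.make_estimators()

# Every tree holds the very same criterion instance as the forest.
shared = all(t.criterion is forest.criterion for t in trees)
print(shared)
```

With a string such as 'mse' this sharing would be harmless, since strings are immutable; with a mutable object, every tree fitted in parallel would be writing to the same instance.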

Steps/Code to Reproduce

The following code reproduces the segfault:

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.tree.tree import CRITERIA_REG

X = np.random.random((1000, 3))
y = np.random.random((1000, 1))

n_samples, n_outputs = y.shape
# Build the Criterion object directly instead of passing criterion='mse'.
mse_criterion = CRITERIA_REG['mse'](n_outputs, n_samples)
# n_jobs=-1 fits the trees in parallel, all with the same criterion object.
rf = ExtraTreesRegressor(n_estimators=400, n_jobs=-1, criterion=mse_criterion)

rf.fit(X, y)  # segfault observed here

Versions

System

python: 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56)  [GCC 7.3.0]

Python deps

sklearn: 0.20.0
setuptools: 40.2.0
pip: 10.0.1
Cython: 0.28.5
numpy: 1.13.3
pandas: 0.23.4
scipy: 1.1.0

Discussion

I've tried wrapping the getattr call in copy.deepcopy() for all the parameters accessed when making the estimators to fit, which seems to fix the problem. Would that be an acceptable fix, or are you interested in a deeper fix?
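The workaround can be sketched roughly as follows. This is not the actual sklearn code: `ToyCriterion` and `collect_params` are hypothetical names, and the sketch assumes only that parameters are read from the ensemble with `getattr` once per tree.

```python
import copy
from types import SimpleNamespace


class ToyCriterion:
    """Hypothetical stand-in for a mutable Criterion object."""

    def __init__(self, n_outputs, n_samples):
        self.n_outputs = n_outputs
        self.n_samples = n_samples
        self.buffer = [0.0] * n_outputs


def collect_params(ensemble, names):
    # Deep-copy each value so every tree gets its own private objects.
    return {name: copy.deepcopy(getattr(ensemble, name)) for name in names}


# SimpleNamespace plays the role of the ensemble holding the parameters.
ensemble = SimpleNamespace(criterion=ToyCriterion(1, 1000), max_depth=None)
per_tree = [
    collect_params(ensemble, ("criterion", "max_depth")) for _ in range(3)
]

# Each tree now has a distinct criterion instance.
distinct = len({id(p["criterion"]) for p in per_tree})

# Mutating one copy leaves the original and the other copies untouched.
per_tree[0]["criterion"].buffer[0] = 1.0
```

The cost is one extra copy of each parameter per tree, which is negligible for strings and numbers but depends on the object for anything larger.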
