Description
Unless I'm missing something, it is not obvious how to use a custom `sklearn.tree._criterion.Criterion` for a decision tree. See my use case here.
Things I have tried include:
- Import the `ClassificationCriterion` in Python and subclass it. It seems that `node_impurity` and `children_impurity` do not get called, the impurity is always 0 (perhaps because they are `cdef` and not `cpdef`?). I'm also unsure what the parameters to `__new__`/`__cinit__` should be (e.g. `1` and `np.array([2], dtype='intp')` for a binary classification problem?), or how to pass them properly: I have to create the `Criterion` object from outside the tree to circumvent the check on the `criterion` argument.
- Extend `ClassificationCriterion` in a Cython file. This seems to work, but (a) it requires exporting `ClassificationCriterion` from `_criterion.pxd` and (b) it would be nice if it would be documented more extensively what should be done in `node_impurity` and `children_impurity`. I will post my code below once it seems to work correctly.
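
To make concrete what I understand the two methods are supposed to compute, here is a minimal pure-Python sketch using Gini impurity. It is not the Cython interface: the real methods operate on the criterion's internal weighted class counts and write the child impurities through pointers, so the function signatures below are my own simplification, and `pos` stands for the split position within the node's (sorted) samples.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of class labels: 1 - sum_k p_k**2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def node_impurity(labels):
    """Impurity of the current node, i.e. of all samples that reached it."""
    return gini(labels)


def children_impurity(labels, pos):
    """Impurity of the left child (labels[:pos]) and right child (labels[pos:]).

    Simplified signature (assumption): the labels are already ordered by the
    feature being split on, and `pos` is the candidate split position.
    """
    return gini(labels[:pos]), gini(labels[pos:])


y = [0, 0, 1, 1]
print(node_impurity(y))         # 0.5
print(children_impurity(y, 2))  # (0.0, 0.0)
```

If this reading is right, a custom criterion only needs to swap `gini` for another impurity measure in both methods.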
May I propose one of the following to make this easier?
- Document what should be done to extend the class in Cython or Python, if Python should be allowed at all. I am aware of the performance cost of a Python implementation; in some cases it may be acceptable, but I don't know.
- Make it possible to pass a function or other object not extending `Criterion` to the tree, similar to how it is very easy to implement a custom scorer for validation functions. That would require changing the checks here.
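
As a rough illustration of the second proposal, the sketch below shows a split search that accepts any callable as its impurity measure. Everything here is hypothetical and pure Python: `best_split` and its `impurity` parameter are names I made up for this example, not part of scikit-learn, and the search covers only a single feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def best_split(x, y, impurity):
    """Find the threshold on one feature that minimises weighted child impurity.

    `impurity` is any callable mapping a list of labels to a float
    (hypothetical interface, not the scikit-learn one).
    Returns (threshold, score); threshold is None if no split improves on
    the impurity of the unsplit node.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    n = len(ys)
    best = (None, impurity(ys))
    for pos in range(1, n):
        if xs[pos] == xs[pos - 1]:
            continue  # cannot place a threshold between equal feature values
        left, right = ys[:pos], ys[pos:]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = ((xs[pos - 1] + xs[pos]) / 2.0, score)
    return best


x = [1.0, 2.0, 3.0, 4.0]
y = [0, 0, 1, 1]
threshold, score = best_split(x, y, entropy)
print(threshold, score)  # 2.5 0.0
```

The point is only that the caller supplies the impurity as a plain function, in the same spirit as a custom scorer; a real implementation would still have to address the performance concern mentioned above.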