lambda-ml.decision-tree

Decision tree learning using the Classification and Regression Trees (CART) algorithm.

Example usage:

(def data [[0 0 0] [0 1 1] [1 0 1] [1 1 0]])
(def fit
  (let [min-split 2
        min-leaf 1
        max-features 2]
    (-> (make-classification-tree gini-impurity min-split min-leaf max-features)
        (decision-tree-fit data))))
(decision-tree-predict fit (map butlast data))
;;=> (0 1 1 0)

best-splitter

(best-splitter model x y)

Returns the splitter for the given data that minimizes a weighted cost function, or returns nil if no splitter exists.
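As a rough illustration of "minimizes a weighted cost function", the sketch below scores each candidate predicate by the size-weighted cost of the two label groups it produces and keeps the cheapest one. The names `gini`, `weighted-cost`, and `pick-best-splitter` are hypothetical, and the weighting scheme is the usual CART convention rather than something confirmed from the library's source.

```clojure
;; A minimal sketch, not the library's implementation.
;; All function names here are hypothetical.

(defn gini
  "Gini impurity of a seq of labels; 0 for an empty seq."
  [labels]
  (let [n (count labels)]
    (if (zero? n)
      0
      (- 1 (reduce + (map (fn [c] (let [p (/ c n)] (* p p)))
                          (vals (frequencies labels))))))))

(defn weighted-cost
  "Cost of each side weighted by its share of the examples (assumed convention)."
  [cost y-left y-right]
  (let [n1 (count y-left)
        n2 (count y-right)
        n  (+ n1 n2)]
    (+ (* (/ n1 n) (cost y-left))
       (* (/ n2 n) (cost y-right)))))

(defn pick-best-splitter
  "Returns the candidate predicate with the lowest weighted cost,
  or nil when there are no candidates."
  [cost candidates x y]
  (when (seq candidates)
    (apply min-key
           (fn [pred]
             (let [groups (group-by (comp boolean pred first) (map vector x y))
                   labels (fn [side] (map second (get groups side)))]
               (weighted-cost cost (labels true) (labels false))))
           candidates)))

;; Two candidate predicates, one per feature.
(def split-on-0 (fn [example] (<= (nth example 0) 1/2)))
(def split-on-1 (fn [example] (<= (nth example 1) 1/2)))

(def x [[0 0] [0 1] [1 0] [1 1]])
(def y [0 0 1 1])

;; Feature 0 separates the labels perfectly, so its predicate is chosen.
(identical? split-on-0 (pick-best-splitter gini [split-on-0 split-on-1] x y))
;;=> true
```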

categorical-partitions

(categorical-partitions vals)

Given a seq of k distinct values, returns the 2^{k-1}-1 possible binary partitions of the values into sets.
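The count 2^{k-1}-1 follows from pinning one value to the left side (so mirrored partitions are not counted twice) and discarding the case where the right side is empty. The sketch below enumerates partitions that way; `subsets` and `binary-partitions` are hypothetical names, and the shape of the return value is an assumption, not necessarily what `categorical-partitions` returns.

```clojure
(require '[clojure.set :as set])

;; A minimal sketch, not the library's implementation.

(defn subsets
  "All subsets of xs, as a vector of sets."
  [xs]
  (reduce (fn [acc x] (into acc (map #(conj % x) acc)))
          [#{}]
          xs))

(defn binary-partitions
  "All [left right] pairs of non-empty disjoint sets that cover vals.
  The first value is always placed on the left to avoid mirrored duplicates."
  [vals]
  (let [[pivot & others] (distinct vals)
        all (set vals)]
    (for [s (subsets others)
          :let [left  (conj s pivot)
                right (set/difference all left)]
          :when (seq right)]
      [left right])))

(count (binary-partitions [:a :b :c]))
;;=> 3
(count (binary-partitions [1 2 3 4]))
;;=> 7
```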

classification-weighted-cost

(classification-weighted-cost y1 y2 f g)

decision-tree-fit

(decision-tree-fit model data)
(decision-tree-fit model x y)

Fits a decision tree to the given training data.
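The usage example at the top of this page passes rows whose last element looks like the label and later recovers the features with `(map butlast data)`. Assuming that convention holds for the two-argument form, the separate `x` and `y` arguments of the three-argument form could be derived as in this sketch, which uses only core functions:

```clojure
;; Assumption: each row is [feature ... label], as in the usage example.
(def data [[0 0 0] [0 1 1] [1 0 1] [1 1 0]])

(def x (map butlast data))
;; x => ((0 0) (0 1) (1 0) (1 1))

(def y (map last data))
;; y => (0 1 1 0)
```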

decision-tree-predict

(decision-tree-predict model x)

Predicts the values of example data using a decision tree.

gini-impurity

(gini-impurity labels)

Returns the Gini impurity of a seq of labels.
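For reference, Gini impurity is one minus the sum of squared class proportions. The sketch below is a standalone illustration under that definition; `gini-sketch` is a hypothetical name, and returning 0 for an empty seq is an assumption made here for convenience.

```clojure
;; A minimal sketch of the standard formula, not the library's implementation.
(defn gini-sketch
  [labels]
  (let [n (count labels)]
    (if (zero? n)
      0
      (- 1 (reduce + (map (fn [c] (let [p (/ c n)] (* p p)))
                          (vals (frequencies labels))))))))

(gini-sketch [0 0 1 1])
;;=> 1/2
(gini-sketch [:a :a :a])
;;=> 0
```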

make-classification-tree

(make-classification-tree cost min-split min-leaf max-features)

Returns a classification decision tree model using the given cost function.

make-regression-tree

(make-regression-tree cost min-split min-leaf max-features)

Returns a regression decision tree model using the given cost function.

mean-squared-error

(mean-squared-error labels predictions)

Returns the mean squared error of a seq of predictions with respect to the corresponding seq of labels.
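Mean squared error is the average of the squared differences between each label and its prediction. The following standalone sketch shows the arithmetic; `mse-sketch` is a hypothetical name and the preconditions are an assumption added for safety.

```clojure
;; A minimal sketch of the standard formula, not the library's implementation.
(defn mse-sketch
  [labels predictions]
  {:pre [(pos? (count labels))
         (= (count labels) (count predictions))]}
  (/ (reduce + (map (fn [y p] (let [d (- y p)] (* d d)))
                    labels
                    predictions))
     (count labels)))

(mse-sketch [1 2 3] [1 2 5])
;;=> 4/3
```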

numeric-partitions

(numeric-partitions vals)

Given a seq of k distinct numeric values, returns the k-1 possible binary partitions of the values, obtained by taking the average of each pair of consecutive elements in the sorted seq of values.
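The averaging step can be pictured as follows: sort the values, pair each one with its successor, and take the midpoint of each pair. `midpoints-sketch` is a hypothetical name, and treating the result as a seq of threshold values is an assumption about how the partitions are represented.

```clojure
;; A minimal sketch, not the library's implementation.
(defn midpoints-sketch
  "Midpoints between consecutive values of the sorted, de-duplicated input."
  [vals]
  (->> (sort (distinct vals))
       (partition 2 1)
       (map (fn [[a b]] (/ (+ a b) 2)))))

(midpoints-sketch [3 1 2])
;;=> (3/2 5/2)
```

With k = 3 distinct values the sketch yields k-1 = 2 thresholds, matching the count given above.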

print-decision-tree

(print-decision-tree model)

Prints information about a given decision tree.

regression-weighted-cost

(regression-weighted-cost y1 y2 f g)

splitters

(splitters x i)

Returns a seq of all possible splitters for feature i. A splitter is a predicate function that evaluates to true if an example belongs in the left subtree, or false if an example belongs in the right subtree, based on the splitting criterion.
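To make the idea of a splitter concrete, the sketch below builds predicates of the kind described: one compares a numeric feature against a threshold, the other tests membership of a categorical feature in a set. Both constructor names are hypothetical, and the use of `<=` (rather than `<`) for the numeric case is an assumption.

```clojure
;; A minimal sketch, not the library's implementation.

(defn numeric-splitter
  "Predicate that is true when feature i is at or below threshold (left subtree)."
  [i threshold]
  (fn [example] (<= (nth example i) threshold)))

(defn categorical-splitter
  "Predicate that is true when feature i is one of left-values (left subtree)."
  [i left-values]
  (fn [example] (contains? left-values (nth example i))))

;; Partition examples into left (true) and right (false) groups.
(group-by (numeric-splitter 0 1/2) [[0 0] [0 1] [1 0] [1 1]])
;;=> {true [[0 0] [0 1]], false [[1 0] [1 1]]}

((categorical-splitter 0 #{:red}) [:red 1])
;;=> true
```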
