Chapter 2

Support code for Chapter 2: Learning How to Classify with Real-world Examples. The directory data contains the seeds dataset, originally downloaded from https://archive.ics.uci.edu/ml/datasets/seeds

chapter.py
The code as printed in the book.
figure1.py
Figure 1 in the book: all 2-by-2 scatter plots
figure2.py
Figure 2 in the book: threshold & decision area
figure4_5_sklearn.py
Figures 4 and 5 in the book: k-NN decision borders before and after feature normalization. This also produces a version of the figure using 11 neighbors (not in the book), which shows a smoother result that is less sensitive to the exact position of each datapoint.
figure4_5_no_sklearn.py
Alternative code for Figures 4 and 5 without using scikit-learn
load.py
Code to load the seeds data
simple_threshold.py
Code from the book: finds the first partition, between Setosa and the other classes.
stump.py
Code from the book: finds the second partition, between Virginica and Versicolor.
threshold.py
Functional implementation of a threshold classifier
heldout.py
Evaluate the threshold model on held-out data
seeds_knn_sklearn.py
Demonstrate cross-validation and feature normalization using scikit-learn
seeds_threshold.py
Test the thresholding model on the seeds dataset (the result is mentioned in the book, but without code)
seeds_knn_increasing_k.py
Test effect of increasing num_neighbors on accuracy.
knn.py
Implementation of k-nearest neighbors without using scikit-learn.
seeds_knn.py
Demonstrate cross-validation (without scikit-learn)
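For orientation, the ideas behind knn.py, seeds_knn.py and the normalization figures can be sketched in a few lines of plain Python. This is not the code from the book or from this directory: the function names (zscore, knn_predict, cross_validate) and the toy data are hypothetical, chosen only to show why feature normalization matters for a nearest-neighbor classifier evaluated with cross-validation.

```python
import math
from collections import Counter


def zscore(rows):
    """Rescale every feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has standard deviation 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_x, train_y, point, k=1):
    """Return the majority label among the k training points closest to point."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(xs, ys, k=1, n_folds=4):
    """Mean accuracy over n_folds folds; fold i holds out items i, i + n_folds, ..."""
    accuracies = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / n_folds


# Made-up toy data (not the seeds dataset): the first feature separates the
# two classes, the second is uninformative but has a much larger scale.
xs = [[1.0, 100.0], [1.2, 300.0], [0.9, 200.0], [1.1, 400.0],
      [3.0, 150.0], [3.2, 350.0], [2.9, 250.0], [3.1, 450.0]]
ys = ["a", "a", "a", "a", "b", "b", "b", "b"]

raw_acc = cross_validate(xs, ys, k=1)
norm_acc = cross_validate(zscore(xs), ys, k=1)
print("raw features:       ", raw_acc)
print("normalized features:", norm_acc)
```

On the raw features the large-scale second column dominates the Euclidean distance, so the nearest neighbor is often from the wrong class; after z-scoring, both columns contribute comparably. For brevity the sketch normalizes the whole dataset before splitting; a stricter evaluation would compute the means and standard deviations on each training fold only.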