This GitHub repository walks through the steps for analyzing the characteristics of zoo animals and classifying them.
Data available at: https://www.kaggle.com/uciml/zoo-animal-classification/data
The zoo.csv file contains 101 rows and 18 variables. The animals fall into 7 classes, and the class information is available in class.csv.
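As a rough sketch of how the data described above could be prepared for modeling, the helper below separates the feature columns from the class label. The function name `split_features` is hypothetical, and the column names `animal_name` and `class_type` are assumptions about the CSV header; adjust them if your copy of zoo.csv differs.

```python
import pandas as pd


def split_features(zoo, name_col="animal_name", target_col="class_type"):
    """Split a zoo DataFrame into features X and class labels y.

    The default column names are assumptions about the CSV header,
    not something this README specifies.
    """
    # Drop the animal name (an identifier, not a feature) and the label.
    X = zoo.drop(columns=[name_col, target_col])
    y = zoo[target_col]
    return X, y


# Possible usage, assuming zoo.csv sits in the working directory:
#   zoo = pd.read_csv("zoo.csv")
#   X, y = split_features(zoo)
#   print(X.shape, y.nunique())
```

Keeping the split in one small function makes it easy to reuse the same X and y for every model.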
To run the code, you need the packages numpy, scipy, scikit-learn, matplotlib, pandas, and seaborn, plus pydotplus for the decision tree graph.
To classify the animals, we trained several machine learning models and compared them to see which has the best accuracy score. The models used are Decision Tree, Random Forest, K-Nearest Neighbors, Support Vector Machine, and Perceptron.
The Random Forest model wins with an accuracy score of 1.0 and a cross-validation score of 0.96.
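The model comparison described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the exact code in this repository: the function name `compare_models`, the 80/20 stratified split, the 5-fold cross-validation, the default hyperparameters, and the random seed are all choices made for the example, so the scores it produces may differ from the ones reported here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Perceptron
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, test_size=0.2, cv=5, seed=0):
    """Fit five classifiers and report test accuracy and mean CV accuracy.

    test_size, cv, and seed are assumed values for this sketch.
    """
    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "K-Nearest Neighbors": KNeighborsClassifier(),
        "Support Vector Machine": SVC(),
        "Perceptron": Perceptron(random_state=seed),
    }
    # Hold out part of the data; stratify keeps class proportions similar.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = {
            # Accuracy on the held-out test split.
            "test_accuracy": model.score(X_test, y_test),
            # Mean accuracy over cv folds of the full data set
            # (cross_val_score fits fresh clones of the model).
            "cv_mean": cross_val_score(model, X, y, cv=cv).mean(),
        }
    return results
```

Reporting the cross-validation mean next to the single-split accuracy is useful on a data set this small, because one 20-row test split can make a model look better or worse than it is.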
Xiaochi Ge, Yijia Chen, Yupeng Yang