This project demonstrates how to use PyTorch and torchvision to classify pet images from the Oxford-IIIT Pet Dataset using a deep learning model (ReXNet) with transfer learning. It includes data visualization, class balancing insight, training monitoring, and model explainability with Grad-CAM.
We use the Oxford-IIIT Pet Dataset, which contains 37 categories (breeds of cats and dogs), with roughly 200 images for each class.
The data is split into three subsets:

- ✅ Training Set
- 🔍 Validation Set
- 🧪 Test Set
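The torchvision dataset ships only `trainval` and `test` splits, so the validation set has to be carved out of `trainval`. Below is a minimal sketch of one way to build the three subsets; the 90/10 train/validation ratio, the fixed seed, and the 224×224 ImageNet-style preprocessing are illustrative assumptions, not values taken from this repo.

```python
import torch
from torch.utils.data import random_split
from torchvision import transforms
from torchvision.datasets import OxfordIIITPet

# Assumed preprocessing: resize to 224x224 and normalize with ImageNet statistics.
tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# torchvision provides "trainval" and "test" splits and downloads the data on first use.
trainval = OxfordIIITPet(root="data", split="trainval", transform=tfms, download=True)
test_set = OxfordIIITPet(root="data", split="test", transform=tfms, download=True)

# Assumed 90/10 split of trainval into train/validation, with a fixed seed for reproducibility.
n_val = int(0.1 * len(trainval))
train_set, val_set = random_split(
    trainval,
    [len(trainval) - n_val, n_val],
    generator=torch.Generator().manual_seed(42),
)
print(f"train: {len(train_set)}  val: {len(val_set)}  test: {len(test_set)}")
```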
- **Data Loading & Preprocessing**
  - Loading the dataset using `torchvision.datasets.OxfordIIITPet`.
  - Applying the necessary transformations (resizing, normalization, etc.).
- **Data Visualization**
  - ⚖️ Class Distribution Charts – visualize class imbalance across the train, validation, and test sets (see the sketch below).
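One quick way to produce such class-distribution charts, reusing `train_set`, `val_set`, and `test_set` from the sketch above (counting by iterating the datasets is simple but slow, since every image gets decoded):

```python
from collections import Counter
import matplotlib.pyplot as plt

def label_counts(dataset):
    # Simple but slow: iterating decodes every image. Fine for a one-off chart.
    return Counter(label for _, label in dataset)

fig, axes = plt.subplots(1, 3, figsize=(18, 4), sharey=True)
for ax, (name, ds) in zip(axes, [("train", train_set), ("validation", val_set), ("test", test_set)]):
    counts = label_counts(ds)
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_title(f"{name} ({len(ds)} images)")
    ax.set_xlabel("class index")
axes[0].set_ylabel("images per class")
plt.tight_layout()
plt.show()
```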
We use ReXNet-150 (ReXNet with a 1.5× width multiplier), a lightweight convolutional neural network optimized for efficiency and performance:
- Backbone: `rexnet_150` (loaded from `timm`)
- Pretrained: Yes (`imagenet`)
- Final Layer: Modified to match the number of pet classes (`num_classes=37`)
- Loss: `CrossEntropyLoss`
- Metrics: Accuracy and F1-score (`torchmetrics.F1Score` with the `multiclass` task)
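A minimal sketch of how these pieces fit together with `timm` and `torchmetrics` (the macro averaging for F1 and `torchmetrics>=0.11` for the `task=` argument are assumptions):

```python
import timm
import torch
import torchmetrics

num_classes = 37

# rexnet_150 pretrained on ImageNet, with the classification head replaced for 37 pet classes.
model = timm.create_model("rexnet_150", pretrained=True, num_classes=num_classes)

loss_fn = torch.nn.CrossEntropyLoss()

# Multiclass accuracy and F1-score; the averaging strategy below is an assumption.
acc = torchmetrics.Accuracy(task="multiclass", num_classes=num_classes)
f1 = torchmetrics.F1Score(task="multiclass", num_classes=num_classes, average="macro")
```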
The training pipeline is built using a custom `TrainValidation` class:
- Optimizer: `Adam` with learning rate = `3e-4`
- Scheduler: `ReduceLROnPlateau` (monitors validation loss)
- Early Stopping: Stops training if the F1-score doesn't improve for 3 epochs
- Epochs: 25 (with early stopping)
- Best Model Saving: Based on the highest F1-score on the validation set
- Device: GPU (`cuda`) with CPU fallback
- Dev Mode: For debugging small runs with `dev_mode=True`
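Below is a condensed, schematic version of how such a training/validation loop can be wired up from these settings; the repo's `TrainValidation` class encapsulates its own implementation, and the batch size, scheduler factor/patience, and checkpoint file name here are assumptions. `model`, `loss_fn`, `f1`, `train_set`, and `val_set` come from the sketches above.

```python
import torch
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, f1 = model.to(device), f1.to(device)

# Assumed batch size; the repo may use a different value.
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
# Reduce the learning rate when validation loss plateaus (factor/patience are assumptions).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=2)

best_f1, bad_epochs, patience = 0.0, 0, 3
for epoch in range(25):
    # --- training ---
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss_fn(model(images), labels).backward()
        optimizer.step()

    # --- validation ---
    model.eval()
    val_loss = 0.0
    f1.reset()
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)
            val_loss += loss_fn(logits, labels).item()
            f1.update(logits, labels)
    val_loss /= len(val_loader)
    val_f1 = f1.compute().item()

    scheduler.step(val_loss)  # scheduler monitors validation loss

    if val_f1 > best_f1:      # keep the checkpoint with the best validation F1
        best_f1, bad_epochs = val_f1, 0
        torch.save(model.state_dict(), "best_model.pth")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # early stopping after 3 epochs without improvement
            break
```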
- Clone the Repository
```bash
git clone https://github.com/matrasulov/Oxford-IIIT-Pet-Dataset---Image-Classification-using-PyTorch.git
cd Oxford-IIIT-Pet-Dataset---Image-Classification-using-PyTorch
```