The PyTorch Implementation of L-Softmax

This repository contains a new, clean, and enhanced PyTorch implementation of the L-Softmax loss proposed in the following paper:

Large-Margin Softmax Loss for Convolutional Neural Networks by Weiyang Liu, Yandong Wen, Zhiding Yu, and Meng Yang [pdf on arXiv] [original Caffe code by the authors]

L-Softmax proposes a modified softmax classification loss that increases inter-class separability and intra-class compactness of the learned features.
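Concretely, for the target class the standard cos(θ) in the softmax logit is replaced by the piecewise margin function ψ(θ) = (-1)^k cos(mθ) - 2k for θ ∈ [kπ/m, (k+1)π/m], where m is the margin. The snippet below is only an illustrative sketch of that margin function (the function name psi is ours, not from this repository):

```python
import math

def psi(theta, m):
    """Illustrative L-Softmax margin: psi(theta) = (-1)^k * cos(m * theta) - 2k,
    with k chosen so that theta lies in [k*pi/m, (k+1)*pi/m]."""
    k = min(int(theta * m / math.pi), m - 1)  # interval index for theta in [0, pi]
    return ((-1) ** k) * math.cos(m * theta) - 2 * k

# With m = 1 this reduces to the ordinary cos(theta) of plain softmax.
```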

This re-implementation is based on the earlier PyTorch implementation here by jihunchoi and borrows some ideas from the TensorFlow implementation here by auroua. The improvements are as follows:

  • Visualization as depicted in the original paper, enabled with the vis argument in the code
  • Cleaner and more readable code
  • More comments in the lsoftmax.py file for future readers
  • Variable names that correspond more closely to the original paper
  • Updated PyTorch 0.4.1 syntax and API
  • Two models are provided: the smaller network used to produce the visualizations in the paper's Fig. 2 and the original MNIST model
  • The lambda optimization (the beta variable in the code) that was missing from the earlier PyTorch implementation (refer to Section 5.1 of the original paper)
  • A fix for the numerical error of torch.acos (a sketch of both numerical points appears after this list)
  • Training logs provided in the Logs folder
  • Some other minor performance improvements
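The last two numerical items above typically look roughly like the sketch below; it is only an illustration of the idea, and names such as cos_theta, eps, and the annealing constants are placeholders rather than the exact names and values used in lsoftmax.py:

```python
import torch

eps = 1e-7  # illustrative tolerance

# 1) torch.acos is only defined on [-1, 1]; floating-point error can push a cosine
#    slightly outside that range, so clamp before taking the arccosine.
cos_theta = torch.tensor([1.0000001, -0.5, 0.3])
theta = torch.acos(cos_theta.clamp(-1.0 + eps, 1.0 - eps))

# 2) The paper's lambda (the beta variable in this code) is annealed toward a small
#    minimum during training, gradually moving the logits from plain softmax to L-Softmax.
beta, beta_min, scale = 100.0, 0.0, 0.99  # illustrative starting values
beta = max(beta * scale, beta_min)        # one annealing step per iteration
```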

Version compatibility

This code has been tested on Ubuntu 18.04 LTS using the PyCharm IDE and an NVIDIA 1080 Ti GPU. Here is a list of libraries and their corresponding versions:

python = 3.6
pytorch = 0.4.1
torchvision = 0.2.1
matplotlib = 2.2.2
numpy = 1.14.3
scipy = 1.1.0

Network parameters

  • batch_size = 256
  • max epochs = 100
  • learning rate = 0.1 (0.01 at epoch 50 and 0.001 at epoch 65)
  • SGD with momentum = 0.9
  • weight_decay = 0.0005
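As a rough guide, the hyperparameters above correspond to a PyTorch 0.4-style training setup along the lines of the sketch below; model and the training-loop body are assumed to exist, and this is not a copy of the repository's training script:

```python
import torch.optim as optim
from torch.optim.lr_scheduler import MultiStepLR

# Illustrative optimizer/scheduler matching the listed hyperparameters; `model` is assumed.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.0005)
scheduler = MultiStepLR(optimizer, milestones=[50, 65], gamma=0.1)  # 0.1 -> 0.01 -> 0.001

for epoch in range(100):
    scheduler.step()  # PyTorch 0.4.x convention: step at the start of each epoch
    # ... train for one epoch with batch_size = 256 ...
```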

Results

Here are the test-set visualization results of training on MNIST with different margins (figure: learned-feature visualizations for the different margins):

  • This plot was generated with the smaller network proposed in the paper (for visualization purposes only), using batch size = 64, a constant learning rate of 0.01 for 10 epochs, and no weight-decay regularization.

And here are the tabulated results of training on MNIST with the network proposed in the paper:

| margin | test accuracy (this code) | test accuracy (paper) |
|--------|---------------------------|-----------------------|
| m = 1  | 99.37%                    | 99.60%                |
| m = 2  | 99.60%                    | 99.68%                |
| m = 3  | 99.56%                    | 99.69%                |
| m = 4  | 99.61%                    | 99.69%                |
  • The test accuracy values are the maximum test accuracy from a single run of the code with the network parameters above.