Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

MNIST classification using Multi-Layer Perceptron (MLP) with 2 hidden layers. Some weight-initializers and batch-normalization are implemented.

Notifications You must be signed in to change notification settings

hwalsuklee/tensorflow-mnist-MLP-batch_normalization-weight_initializers

Open more actions menu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Various initializers and batch normalization

An implementation of weight/bias initializers and batch normalization in Tensorflow.

MNIST database is used to show performance-comparison

Network architecture

In order to examine the effect of initializers and batch normalization, a simple network architecture called multilayer perceptrons (MLP) is employed.

MLP has following architecture.

  • input layer : 784 nodes (MNIST images size)
  • first hidden layer : 256 nodes
  • second hidden layer : 256 nodes
  • output layer : 10 nodes (number of class for MNIST)

Various initializers

The following initializers for weights/biases of network are considered.

Simulation results

  • Weight Initializer : he > trauncated normal = xaiver > normal
  • Bias Initilaizer : zero > normal

Sample results are following.

weight_init

Index Weight Initializer Bias Initializer Accuracy
normal_w_normal_b_0.9451 normal normal 0.9451
normal_w_zero_b_0.9485 normal zero 0.9485
truncated_normal_w_normal_b_0.9788 truncated_normal normal 0.9788
truncated_normal_w_zero_b_0.9790 truncated_normal zero 0.9790
xavier_w_normal_b_0.9800 xavier normal 0.9800
xavier_w_zero_b_0.9806 xavier zero 0.9806
he_w_normal_b_0.9798 he normal 0.9798
he_w_zero_b_0.9811 he zero 0.9811

Batch normalization

Batch normalization improves performance of network in terms of final accuracy and convergence rate.
In this simulation, bias initalizer was zero-constant initializer and weight initializers were xavier / he.

Sample results are following.

batch_norm

Index Weight Initializer Batch Normalization Accuracy
xavier_woBN_0.9806 xavier Unused 0.9806
xavier_withBN_0.9812 xavier Used 0.9812
he_woBN_0.9811 he Unused 0.9811
he_withBN_0.9837 he Used 0.9837

Usage

python run_main.py --weight-init <weight initializer> --bias-init <bias initializer> --batch-norm <True or False>

<weight initializer> must be selected in [normal, truncated_normal, xavier, he].
<bias initializer> must be selected in [normal, zero].

You may command like python run_main.py --weight-init xavier --bias-init zero --batch-norm True.

Acknowledgement

This implementation has been tested on Tensorflow r0.12.

About

MNIST classification using Multi-Layer Perceptron (MLP) with 2 hidden layers. Some weight-initializers and batch-normalization are implemented.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

Morty Proxy This is a proxified and sanitized view of the page, visit original site.