jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: 1.1
      jupytext_version: 1.1.1
  kernelspec:
    display_name: Python 2
    language: python
    name: python2
  plotly:
    description: Learn how to normalize data by fitting to intervals on the real line and dividing by a constant
    display_as: mathematics
    has_thumbnail: false
    language: python
    layout: base
    name: Normalization
    order: 2
    page_type: example_index
    permalink: python/normalization/
    thumbnail: /images/static-image

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Imports

The tutorial below imports NumPy, Pandas, and SciPy.

import plotly.plotly as py
import plotly.graph_objs as go
import plotly.tools as tools
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy

Import Data

To visualize the data and its normalization, import a dataset of Apple stock prices from 2014:

apple_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_apple_stock.csv')
df = apple_data[0:10]

table = FF.create_table(df)
py.iplot(table, filename='apple-data-sample')

Normalize by a Constant

Normalize a dataset by dividing each data point by a constant, such as the standard deviation of the data.

data = apple_data['AAPL_y']

# Compute the standard deviation once, then divide every value by it.
data_std = np.std(data)
data_norm_by_std = [number / data_std for number in data]

trace1 = go.Histogram(
    x=data,
    opacity=0.75,
    name='data'
)

trace2 = go.Histogram(
    x=data_norm_by_std,
    opacity=0.75,
    name='normalized by std = ' + str(np.std(data)),
)

fig = tools.make_subplots(rows=2, cols=1)

fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)

fig['layout'].update(height=600, width=800, title='Normalize by a Constant')
py.iplot(fig, filename='apple-data-normalize-constant')
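
As a quick sanity check on the idea above, dividing by the standard deviation should leave the shape of the distribution unchanged while giving the result a standard deviation of 1. The sketch below is only illustrative: `sample` is a small made-up array standing in for the `AAPL_y` column, and it uses NumPy's vectorized division rather than a list comprehension.

```python
import numpy as np

# Hypothetical values standing in for the AAPL_y column (not the real data).
sample = np.array([77.4, 79.0, 80.1, 76.5, 75.8, 78.2])

# Population standard deviation (ddof=0), which is NumPy's default.
sample_std = np.std(sample)

# Vectorized division: every element is divided by the same constant.
sample_norm_by_std = sample / sample_std

# The scaled data has a standard deviation of 1, up to floating-point rounding.
print(np.std(sample_norm_by_std))
```

Because every value is divided by the same constant, the histogram keeps its shape and only the x-axis scale changes.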

Normalize to [0, 1]

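Normalize a dataset by dividing each data point by the Euclidean (L2) norm of the dataset. The result is a vector of unit length; because stock prices are positive, every scaled value falls in [0, 1], although the smallest value does not map to 0 and the largest does not map to 1.

# Compute the Euclidean norm once, then divide every value by it.
data_norm = np.linalg.norm(data)
data_norm_to_0_1 = [number / data_norm for number in data]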

trace1 = go.Histogram(
    x=data,
    opacity=0.75,
    name='data',
)

trace2 = go.Histogram(
    x=data_norm_to_0_1,
    opacity=0.75,
    name='normalized to [0,1]',
)

fig = tools.make_subplots(rows=2, cols=1)

fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)

fig['layout'].update(height=600, width=800, title='Normalize to [0,1]')
py.iplot(fig, filename='apple-data-normalize-0-1')
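
Dividing by the norm keeps positive data inside [0, 1], but it does not stretch the data across the whole interval. A common alternative, which the example above does not use, is min-max scaling: subtract the minimum and divide by the range. The sketch below compares the two on a small made-up array (`sample` is illustrative, not the Apple data).

```python
import numpy as np

# Hypothetical values standing in for the AAPL_y column (not the real data).
sample = np.array([77.4, 79.0, 80.1, 76.5, 75.8, 78.2])

# Division by the Euclidean norm: the result is a vector of unit length.
unit_length = sample / np.linalg.norm(sample)

# Min-max scaling: the smallest value maps to 0 and the largest to 1.
min_max = (sample - sample.min()) / (sample.max() - sample.min())

print(np.linalg.norm(unit_length))     # length of the norm-scaled vector
print(unit_length.min(), unit_length.max())
print(min_max.min(), min_max.max())
```

Note that min-max scaling is undefined for constant data, where the range is zero.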

Normalize to Any Interval

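Normalize a dataset to an interval [a, b], where a < b are real numbers, by linearly rescaling it so that the smallest value maps to a and the largest maps to b.

a = 10
b = 50
# Linear rescaling: min(data) maps to a and max(data) maps to b.
data_min, data_max = min(data), max(data)
data_norm_to_a_b = [a + (number - data_min) * (b - a) / (data_max - data_min) for number in data]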

trace1 = go.Histogram(
    x=data,
    opacity=0.75,
    name='data',
)

trace2 = go.Histogram(
    x=data_norm_to_a_b,
    opacity=0.75,
    name='normalized to [10,50]',
)

fig = tools.make_subplots(rows=2, cols=1)

fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)

fig['layout'].update(height=600, width=800, title='Normalize to [10,50]')
py.iplot(fig, filename='apple-data-normalize-a-b')
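
The linear rescaling to [a, b] can also be wrapped in a small reusable helper. This is a minimal sketch: `rescale_to_interval` is a hypothetical name introduced here for illustration, `sample` is made-up data, and the function assumes the input is not constant (otherwise the range is zero).

```python
import numpy as np


def rescale_to_interval(values, a, b):
    """Linearly map values so that min(values) -> a and max(values) -> b."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A zero range would cause a division by zero below.
        raise ValueError("cannot rescale constant data: max equals min")
    return a + (values - lo) * (b - a) / (hi - lo)


# Hypothetical values standing in for the AAPL_y column (not the real data).
sample = [77.4, 79.0, 80.1, 76.5, 75.8, 78.2]
rescaled = rescale_to_interval(sample, 10, 50)

# The extremes of the rescaled data land on the interval endpoints.
print(rescaled.min(), rescaled.max())
```

Setting `a = 0` and `b = 1` reduces this to ordinary min-max scaling.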
from IPython.display import display, HTML

display(HTML('<link href="//fonts.googleapis.com/css?family=Open+Sans:600,400,300,200|Inconsolata|Ubuntu+Mono:400,700" rel="stylesheet" type="text/css" />'))
display(HTML('<link rel="stylesheet" type="text/css" href="http://help.plot.ly/documentation/all_static/css/ipython-notebook-custom.css">'))

! pip install git+https://github.com/plotly/publisher.git --upgrade
import publisher
publisher.publish(
    'python_Normalization.ipynb', 'python/normalization/', 'Normalization | plotly',
    'Learn how to normalize data by fitting to intervals on the real line and dividing by a constant',
    title='Normalization in Python. | plotly',
    name='Normalization',
    language='python',
    page_type='example_index', has_thumbnail='false', display_as='mathematics', order=2,
    ipynb= '~notebook_demo/103')