suubh
diff --git a/README.md b/README.md
19 additions & 6 deletions
@@ -14,7 +14,7 @@ The following libraries are required to successfully implement the projects.
 The projects are divided into various categories listed below -
 
 ## Supervised Learning 
-  - [##**Linear Regression**]()
+  - [**Linear Regression**]()
      - [Linear Regression Single Variables.](https://github.com/suubh/Machine-Learning-in-Python/blob/master/Linear%20Regression/LinearRegressionSingle%20Variables.ipynb) : A Simple Linear Regression Model to model the linear relationship between Population and Profit for plot sales.
      - [Linear Regression Multiple Variables.](https://github.com/suubh/Machine-Learning-in-Python/blob/master/Linear%20Regression/LinearRegressionMultipleVariables.ipynb) : In this project, I build a Linear Regression Model for multiple variables for predicting the House price based on acres and number of rooms.
 
@@ -31,13 +31,26 @@ The projects are divided into various categories listed below -
   - [**Random Forest Classification**](https://github.com/suubh/Machine-Learning-in-Python/blob/master/RandomForest/RandomForest.ipynb) : In this project I used Random Forest Classifier (90.0%) and Random Forest Regressor (61.8%) on the Social Network Ads dataset. 
 
 ## Unsupervised Learning 
-  - [**K Means Clustering**](https://github.com/suubh/Machine-Learning-in-Python/blob/master/K-means/creditcard.ipynb) : K-Means clustering is used to find intrinsic groups within the unlabelled dataset and draw inferences.It is one of the most detailed projects, In this project, I implement K-Means Clustering  on Credit Card Dataset to cluster different credit card users based on the features.I scaled the data using `StandardScaler` because normalizing will improves the convergence.I also implemented the [*Elbow Method*](https://en.wikipedia.org/wiki/Elbow_method_(clustering)) to search for the best numbers of clusters.For visualizing the dataset I used [*PCA(Principal Component Analysis)*](https://en.wikipedia.org/wiki/Principal_component_analysis) for dimensionality reduction as the dataset features were large in number.In the end I used [*Silhouette Score*]() which is used to calculate the performance of clustering . It ranges from -1 to 1 and I got a score of 0.203.
-
-## NLP(Natural Language Processing)
-  - [Text Analytics]()
-  - [Sentiment Analysis]()
+  - [**K Means Clustering**](https://github.com/suubh/Machine-Learning-in-Python/blob/master/K-means/creditcard.ipynb) : K-Means clustering is used to find intrinsic groups within an unlabelled dataset and draw inferences. This is one of the most detailed projects: I implement K-Means Clustering on the Credit Card Dataset to cluster credit card users based on their features. I scaled the data using `StandardScaler`, which standardizes each feature to zero mean and unit variance (it does not rescale to the range 0 to 1), because putting the features on a comparable scale improves convergence. I also implemented the [*Elbow Method*](https://en.wikipedia.org/wiki/Elbow_method_(clustering)) to search for the best number of clusters. To visualize the dataset I used [*PCA (Principal Component Analysis)*](https://en.wikipedia.org/wiki/Principal_component_analysis) for dimensionality reduction, as the dataset has a large number of features. Finally, I used the [*Silhouette Score*](), which measures the quality of the clustering. It ranges from -1 to 1, and I got a score of 0.203.
+
+## NLP (Natural Language Processing)
+  - [Text Analytics](https://github.com/suubh/Machine-Learning-in-Python/blob/master/TextAnalytics/textAnalytics.ipynb) : An introductory Text Analytics project in NLP. I performed the following important steps -
+    - ***Tokenization***
+    - ***Removal of Special Characters***
+    - ***Lower Case***
+    - ***Removing StopWords***
+    - ***Stemming*** 
+    - ***Count Vectorizer*** (which generally performs all the steps mentioned above except stemming)
+    - ***DTM (Document Term Matrix)***
+    - ***TF-IDF (Term Frequency-Inverse Document Frequency)***
+    
+  - [Sentiment Analysis](https://github.com/suubh/Machine-Learning-in-Python/tree/master/Sentiment%20Analysis) : I applied sentiment analysis to the MovieReview (dataset from the nltk library) and RestaurentReview datasets to predict positive and negative reviews. I used a Naive Bayes Classifier (78.8%) and Logistic Regression (84.3%) to build the models and make predictions.
 
 ## Data Cleaning and Preprocessing
+  - [Data Preprocessing](https://github.com/suubh/Machine-Learning-in-Python/blob/master/Data%20Preprocessing/Untitled.ipynb) : I perform various data preprocessing and cleaning methods, which are listed below -
+    - ***Label Encoding*** : It converts each category into a unique integer from 0 to n-1, where n is the number of distinct categories in the column.
+    - ***Ordinal Encoding*** : It maps categories to ordered numerical values.
+    - ***One Hot Encoding*** : It creates one dummy (0/1) column for each distinct category value, so a column with n unique values produces n extra columns.
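
The three encodings listed above can be illustrated with a minimal plain-Python sketch. It is not taken from the linked notebook, and it deliberately avoids any library so the output of each encoding is easy to inspect. The helper names (`label_encode`, `ordinal_encode`, `one_hot_encode`) and the `sizes` sample column are hypothetical, chosen only for illustration.

```python
def label_encode(values):
    """Map each distinct category to an integer in 0..n-1 (alphabetical order)."""
    classes = sorted(set(values))
    index = {category: i for i, category in enumerate(classes)}
    return [index[v] for v in values], classes


def ordinal_encode(values, order):
    """Map categories to integers following an explicit, caller-supplied order."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """Create one 0/1 column per distinct category (columns in alphabetical order)."""
    classes = sorted(set(values))
    rows = [[1 if v == category else 0 for category in classes] for v in values]
    return rows, classes


# Hypothetical sample column, for illustration only.
sizes = ["small", "large", "medium", "small"]

labels, label_classes = label_encode(sizes)
ranks = ordinal_encode(sizes, order=["small", "medium", "large"])
one_hot, one_hot_columns = one_hot_encode(sizes)

print(label_classes, labels)
print(ranks)
print(one_hot_columns, one_hot)
```

Label encoding assigns integers without regard to meaning (here alphabetically), whereas ordinal encoding preserves the order small < medium < large. One-hot encoding avoids implying any order at all, at the cost of one extra column per category.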