sudam802/NLP_tutorials


NLP_tutorials

Natural language processing tutorials

The code in this repository is a machine learning pipeline that classifies disaster-related tweets. It begins by importing the necessary libraries and loading the training and test datasets. The tweets are preprocessed by removing the articles "a", "an", and "the" with a regular expression. A CountVectorizer then converts the text into a matrix of token counts. A RidgeClassifier model is trained on this matrix, and its performance is evaluated using 3-fold cross-validation with F1 as the scoring metric. Finally, the model makes predictions on the test dataset, and an example tweet is printed to demonstrate the preprocessing.
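The steps described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the inline tweets and labels are made-up stand-ins for the real training and test files, the names `remove_articles`, `train_texts`, `train_labels`, and `test_texts` are hypothetical, and fitting the vectorizer on the training text only is an assumption about how the original handles the test set.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data; the real pipeline loads train/test datasets.
train_texts = [
    "Forest fire near the village, residents evacuated",
    "A flood has destroyed the bridge",
    "Earthquake damages an apartment building",
    "The storm left thousands without power",
    "Wildfire smoke covers the city",
    "An explosion injured several people",
    "I love the new coffee shop",
    "What a great movie that was",
    "The cat is sleeping on an armchair",
    "Going to the gym after a long day",
    "This song is stuck in my head",
    "Had an amazing lunch with a friend",
]
train_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = disaster, 0 = not
test_texts = ["The fire spread to a nearby town", "An evening walk in the park"]

# Match "a", "an", or "the" as whole words, ignoring case.
ARTICLES = re.compile(r"\b(?:a|an|the)\b", flags=re.IGNORECASE)


def remove_articles(text):
    """Drop English articles and collapse the leftover whitespace."""
    return " ".join(ARTICLES.sub(" ", text).split())


clean_train = [remove_articles(t) for t in train_texts]
clean_test = [remove_articles(t) for t in test_texts]

# Learn the vocabulary from the training text, then reuse it for the test text.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(clean_train)
X_test = vectorizer.transform(clean_test)

# Evaluate with 3-fold cross-validation, scored by F1.
clf = RidgeClassifier()
scores = cross_val_score(clf, X_train, train_labels, cv=3, scoring="f1")
print("F1 per fold:", scores)

# Train on all training data and predict labels for the test tweets.
clf.fit(X_train, train_labels)
predictions = clf.predict(X_test)
print("Predictions:", predictions)

# Show one tweet before and after preprocessing.
print(train_texts[1], "->", clean_train[1])
```

With a dataset this small the cross-validation scores carry no meaning. The sketch only shows how the pieces fit together.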
