Natural Language Processing for Multiclass Classification: A repository containing NLP techniques for multiclass classification of text data.


Snigdho8869/Multiclass-Text-Classification


Multiclass Text Classification Project

Project Overview

The goal of this project is to classify text data into predefined categories using a combination of traditional machine learning models and deep learning architectures. The project includes:

  • A Flask-based web application for interactive text classification.
  • Preprocessing of text data, including cleaning, tokenization, and lemmatization.
  • Training and evaluation of multiple models, including:
    • Traditional ML models: Logistic Regression, SVM, Naive Bayes, Random Forest, Gradient Boosting, AdaBoost, and an Ensemble model.
    • Deep learning models: LSTM, GRU, CNN, and a hybrid LSTM+CNN model.
    • Fine-tuning of transformer-based models: BERT and XLNet using ktrain.
  • Visualization of results, including confusion matrices, accuracy plots, and word clouds.
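The cleaning and tokenization steps listed above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code: the function names `clean_text` and `tokenize` and the tiny `STOPWORDS` set are assumptions made for illustration. Lemmatization is left out because it needs a linguistic resource (for example NLTK's `WordNetLemmatizer`), which would be applied to each token after this step.

```python
import re

# A tiny illustrative stopword list (an assumption); a real pipeline would
# likely use a fuller list from a library.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def clean_text(text):
    """Lowercase the text, strip HTML-like tags, and drop non-letter characters."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)      # remove HTML-like tags
    text = re.sub(r"[^a-z\s]", " ", text)     # keep only letters and whitespace
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace


def tokenize(text):
    """Split cleaned text on whitespace and remove stopwords."""
    return [tok for tok in clean_text(text).split() if tok not in STOPWORDS]


if __name__ == "__main__":
    print(tokenize("<p>The markets ARE rising in 2024!</p>"))
```

The resulting token lists can then be turned into features (for example TF-IDF vectors for the traditional models, or padded integer sequences for the LSTM, GRU, and CNN models).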

Requirements:

  • Python

  • Scikit-learn

  • TensorFlow

  • Keras

Dataset:

The dataset used in this project is the bbc-text dataset, which consists of approximately 2,225 text documents.
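A labeled text dataset like this is commonly stored as a CSV file, so loading it might be sketched as follows. The column names `category` and `text`, the helper `load_dataset`, and the sample rows are all assumptions for illustration; check the actual file's header before relying on them.

```python
import csv
import io
from collections import Counter


def load_dataset(handle):
    """Read (texts, labels) from a CSV with 'category' and 'text' columns.

    The column names are an assumption about the file layout.
    """
    texts, labels = [], []
    for row in csv.DictReader(handle):
        texts.append(row["text"])
        labels.append(row["category"])
    return texts, labels


# Tiny in-memory stand-in for the real file (made-up rows, illustration only).
SAMPLE_CSV = """category,text
sport,The team won the final match
tech,New phone released this week
sport,Striker signs a new contract
"""

texts, labels = load_dataset(io.StringIO(SAMPLE_CSV))
class_counts = Counter(labels)  # documents per category
print(len(texts), dict(class_counts))
```

With a real file, the same function would be called on an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the dataset is saved. Checking `class_counts` first is a quick way to see whether the categories are balanced before training.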

Results:

The results of each model on the bbc-text dataset are as follows:

| Model | Accuracy |
|---|---|
| Logistic Regression | 96.58% |
| Support Vector Machine | 96.94% |
| Multinomial Naive Bayes | 94.97% |
| Random Forest | 95.15% |
| Gradient Boosting | 94.25% |
| Ensemble Classifier | 97.12% |
| AdaBoost | 94.43% |
| LSTM (1 layer) | 99.22% |
| LSTM (2 layers) | 97.78% |
| GRU | 91.74% |
| CNN+LSTM | 98.73% |
| BERT | 99.60% |
| XLNet | 99.46% |
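The figures above are accuracies, which under the usual definition are the fraction of samples whose predicted label matches the true label. The sketch below shows that computation along with a confusion matrix of the kind mentioned in the overview. It assumes the standard definitions rather than reproducing the project's evaluation code, and the labels and predictions are made-up toy values.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy values for illustration only (not results from this project).
y_true = ["sport", "tech", "sport", "business"]
y_pred = ["sport", "tech", "business", "business"]
class_labels = ["business", "sport", "tech"]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, class_labels)
print(f"accuracy: {acc:.2%}")
print(cm)
```

Off-diagonal entries in the matrix show which categories a model confuses with each other, which a single accuracy number cannot reveal.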

Application Interface

[Image: screenshot of the application interface]
