nlp-assignment's Introduction

Authorship Attribution with Machine Learning 📚🤖

Welcome to the GitHub repository for the Authorship Attribution project, where machine learning meets linguistic analysis! The goal is to identify which author wrote a given text from that author's writing style. The dataset comprises texts from six different authors, which makes this a supervised, multi-class classification problem with a twist of linguistics.

Project Overview 🌟

Authorship Attribution is the process of identifying the author of a text based on their unique writing style or 'fingerprint'. This project is split into two main parts:

Data Cleaning and Feature Engineering - Where we prepare the text data and extract meaningful features that capture the essence of each author's style.
Model Training and Evaluation - Where various machine learning models are trained and evaluated to find the one that best identifies the authors.
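
As a rough illustration of what "features that capture an author's style" can look like, here is a minimal sketch using only the Python standard library. The feature set, the `FUNCTION_WORDS` list, and the `style_features` helper are hypothetical examples chosen for illustration; they are not taken from the project's notebooks, which may use different features.

```python
import re
from collections import Counter

# Hypothetical function-word list; the project's actual features are not specified here.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in")


def style_features(text):
    """Return a small dict of simple style features for one text."""
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    n_words = max(len(words), 1)  # avoid division by zero on empty input

    features = {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(counts) / n_words,
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features[f"freq_{fw}"] = counts[fw] / n_words
    return features


sample = "The cat sat. The dog ran to the cat!"
features = style_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Each text becomes one row of numeric features, which is the kind of input a scikit-learn classifier can be trained on in the second part.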

Repository Contents 📁

Part1_Data_Cleaning_and_Feature_Engineering.ipynb
Part2_Model_Training_and_Evaluation.ipynb
cleaned_data.csv
mwe_tokenizer.pkl
Assignment_Data (folder containing dataset)

Technologies Used 💻

Python
Pandas & NumPy for data manipulation
NLTK for natural language processing
scikit-learn for machine learning
Matplotlib & Seaborn for visualization
Jupyter Notebook for interactive development

Performance Highlight: 95% F1 Score on Stratified 5-Fold Cross-Validation 🏅📈

The headline result of this Authorship Attribution project is a 95% F1 score, obtained with Stratified 5-Fold Cross-Validation. Because the score is measured across five stratified folds rather than on a single train/test split, it is more than an accuracy figure: it is evidence that the model is robust and performs consistently across different subsets of the data.
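
For readers unfamiliar with the evaluation setup, the following is a minimal sketch of stratified 5-fold cross-validation with an F1 metric in scikit-learn. It runs on a synthetic six-class dataset as a stand-in for the real features, and the classifier (`LogisticRegression`) and the macro-averaged F1 scoring are assumptions made for illustration; the project's actual model and F1 averaging are not stated here, and this sketch will not reproduce the 95% figure.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the extracted features: 6 classes, like the 6 authors.
X, y = make_classification(
    n_samples=300,
    n_features=20,
    n_informative=10,
    n_classes=6,
    random_state=0,
)

# Assumed classifier; the project compares several models.
model = LogisticRegression(max_iter=1000)

# Stratification keeps the class proportions roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One macro-averaged F1 score per fold (the averaging mode is an assumption).
scores = cross_val_score(model, X, y, cv=cv, scoring="f1_macro")

print("F1 per fold:", scores.round(3))
print("Mean F1:", scores.mean().round(3), "Std:", scores.std().round(3))
```

Reporting the per-fold spread alongside the mean is what supports a claim of consistency: a small standard deviation means no single fold is carrying the result.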

Repository: gangula-karthik / nlp-assignment
