mangipudiprashanth7 / twitter_sentiment_analysis-using-ml-and-nlp

The objective of this project is to detect hate speech in tweets. For the sake of simplicity, we say a tweet contains hate speech if it has a racist or sexist sentiment associated with it. So, the task is to distinguish racist or sexist tweets from all other tweets.

License: MIT License

Jupyter Notebook 100.00%

twitter_sentiment_analysis-using-ml-and-nlp's Introduction

Twitter Sentiment Analysis

If you like my work, do leave a star ✨. I'd be more than happy to receive any suggestions to improve my work!

This project uses a variety of techniques. Basic concepts such as data cleaning and data transformation are also explained step by step.

Word2Vec features turned out to be the most useful, and XGBoost with Word2Vec features was the best model for this problem. This shows the power of word embeddings in dealing with NLP problems.
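To make "Word2Vec features" concrete, here is a minimal sketch of one common way to turn per-word vectors into a single fixed-length feature vector per tweet: averaging the vectors of the words that have an embedding. This is an assumption about the general technique, not a copy of the notebook's code. The `tweet_vector` helper and the two-dimensional `toy_embeddings` are hypothetical; a real model would supply the vectors.

```python
from typing import Dict, List


def tweet_vector(tokens: List[str],
                 embeddings: Dict[str, List[float]],
                 size: int) -> List[float]:
    """Average the embeddings of the tokens that are in the vocabulary."""
    total = [0.0] * size
    known = 0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            # Skip words the embedding model has never seen.
            continue
        total = [t + v for t, v in zip(total, vec)]
        known += 1
    if known == 0:
        # No known words: fall back to an all-zero vector.
        return total
    return [t / known for t in total]


# Hypothetical toy embeddings, used only for illustration.
toy_embeddings = {
    "love": [1.0, 0.0],
    "hate": [-1.0, 0.0],
    "people": [0.0, 2.0],
}

features = tweet_vector(["hate", "people", "unknownword"], toy_embeddings, size=2)
```

Each tweet then becomes one row of a feature matrix that a classifier such as XGBoost can be trained on.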

WHAT ELSE CAN BE TRIED?

We have covered a lot in this Sentiment Analysis project, but there is still plenty of room to try other things. Given below is a list of tasks that you can try with this data.

1. We have built many models in this project, so model ensembling is a natural next step. A simple ensemble of all the submission files (maximum voting) yielded an F1 score of 0.55 on the public leaderboard.

2. Use Parts-of-Speech tagging to create new features.

3. Use stemming and/or lemmatization. It might help by reducing different forms of the same word to a single token.

4. Use bi-grams or tri-grams (tokens of 2 or 3 words respectively) for Bag-of-Words and TF-IDF.

5. Give pretrained word-embedding models a try.
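Two of the ideas above, maximum voting (item 1) and n-grams (item 4), can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `majority_vote` and `ngrams` are hypothetical, and the prediction lists are made-up examples, not the project's submission files.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists by picking the most common label per tweet."""
    ensembled = []
    # zip(*predictions) groups the labels that all models gave to the same tweet.
    for labels in zip(*predictions):
        ensembled.append(Counter(labels).most_common(1)[0][0])
    return ensembled


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all runs of n consecutive tokens (n=2 gives bi-grams, n=3 tri-grams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up predictions from three hypothetical models (1 = hate speech, 0 = other).
model_a = [0, 1, 1, 0]
model_b = [0, 1, 0, 0]
model_c = [1, 1, 1, 0]
voted = majority_vote([model_a, model_b, model_c])

bigrams = ngrams(["this", "is", "a", "tweet"], 2)
```

Using an odd number of models avoids ties with binary labels. In practice a vectorizer from a library such as scikit-learn can build n-gram features directly, so a helper like `ngrams` is mainly useful for seeing what those tokens look like.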

Credits✨

Analytics Vidhya Sentiment Analysis Project! 🎈
