This repository contains the report and Python code for the final project of the Text Mining course.
The course introduces methods for mining, organizing, summarizing, and analyzing text data to support decision-making. It covers the following topics:
- Text Mining Objectives and Business Applications
- Text Data Pre-Processing and Feature Engineering
- Text Classification
- Text Summarization and Topic Modeling
- Opinion Mining and Sentiment Analysis
- Topic: This project analyzes customer feedback on Disneyland theme parks to identify the topics that matter most to customers and to determine their overall sentiment towards the different branches. The insights from this analysis could be used to improve Disneyland's attractions, services, and overall experience, and to identify areas for marketing and advertising.
- Data Source: The data comes from the Disneyland Reviews dataset on Kaggle. The dataset includes 42,000 reviews of 3 Disneyland branches (Paris, California, and Hong Kong), posted by visitors on Trip Advisor.
- Framework: This project uses two text mining methods. The first is Latent Dirichlet Allocation (LDA) topic modeling, which identifies the main topics and concerns mentioned in the reviews for each of the three Disneyland parks. The second is sentiment analysis, which determines the sentiment of each review and compares sentiment tendencies across branches. The sentiment analysis also compares the characteristics of customer reviews across different time periods.
- Result: The analysis revealed differences between the branches in both the specific focus and the sentiment of their reviews, which we used to draw conclusions on how to improve the customer experience.
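
The two methods in the framework can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the toy reviews, the stop-word list, the sentiment lexicon, and all function names (`tokenize`, `lda_gibbs`, `top_words`, `sentiment_score`) are hypothetical and chosen only for illustration. The sketch assumes LDA fitted by collapsed Gibbs sampling on one small corpus and a simple lexicon-based sentiment score averaged per branch. A real analysis would more likely fit a separate model per park and rely on an established library for both steps.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data as (branch, review text); the project itself uses the Kaggle dataset.
REVIEWS = [
    ("Paris", "long queue for the ride and the queue was slow"),
    ("Paris", "great ride but long queue"),
    ("California", "great food and friendly staff"),
    ("California", "food was expensive but staff friendly"),
    ("Hong Kong", "friendly staff and clean park"),
    ("Hong Kong", "clean park and fun ride"),
]
STOPWORDS = {"the", "and", "for", "was", "but"}          # assumed, tiny stop-word list
POSITIVE = {"great", "friendly", "clean", "fun"}         # assumed toy lexicon
NEGATIVE = {"long", "slow", "expensive"}                 # assumed toy lexicon


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (doc-topic, topic-word) count tables."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(n_topics) for _ in doc]
        assign.append(zs)
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Most frequent words assigned to each topic."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


def sentiment_score(tokens):
    """Score in [-1, 1]: (positive hits - negative hits) / total lexicon hits."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / max(1, pos + neg)


docs = [tokenize(text) for _, text in REVIEWS]

# Topic modeling over the whole toy corpus.
doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
print("Top words per topic:", top_words(topic_word))

# Sentiment analysis: average review score per branch.
scores_by_branch = defaultdict(list)
for (branch, _), tokens in zip(REVIEWS, docs):
    scores_by_branch[branch].append(sentiment_score(tokens))
branch_sentiment = {b: sum(s) / len(s) for b, s in scores_by_branch.items()}
print("Mean sentiment per branch:", branch_sentiment)
```

With only six short reviews the topics themselves carry little meaning. The sketch is meant to show the shape of the pipeline (tokenize, fit topics, score sentiment, aggregate by branch) rather than to reproduce the report's findings.
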
Course Instructor: Yulia Nevskaya, Ph.D.