mathewsrc / topic-modeling-reclame-aqui

Comparison of Topic Modeling Approaches on Complaints Related to the E-Commerce Industry

License: Creative Commons Zero v1.0 Universal

Jupyter Notebook 1.09% Python 0.01% HTML 98.91% Makefile 0.01%
colab-notebook python topic-modeling webscraping natural-language-processing bertopic lda lsi

Topic Modeling Reclame Aqui

Title: Comparison of Topic Modeling Approaches on Complaints Related to the E-Commerce Industry

Abstract

The e-commerce sector faces a major challenge in dealing with a large volume of consumer complaints, which can have various origins, from delivery times to problems with service cancellations. These complaints can damage a company's reputation and affect its ability to retain existing customers and attract new ones. Topic modeling is valuable in this context because it can help companies understand and categorize consumer complaints automatically. This project compared and evaluated three popular topic modeling algorithms, LDA, BERTopic, and LSI, on the task of extracting relevant topics from customer complaints posted on the Brazilian website "Reclame Aqui". The project used a dataset of approximately twelve thousand customer complaints about products and services from the major e-commerce platforms in Brazil. The data was preprocessed and used to train and optimize the topic models, and performance was evaluated using the average coherence value. The results showed that the LDA and BERTopic models were able to extract informative topics from the data, and that BERTopic achieved the best average coherence value among the three models. These findings suggest that companies seeking valuable information from customer complaints can benefit from topic modeling algorithms such as LDA and BERTopic to better understand their customers' concerns and take data-driven actions to improve customer satisfaction.

Keywords: Transformers; Latent Dirichlet Allocation; Latent Semantic Indexing; BERTopic

Overview

[Image: project overview]

Results

LDA - Intertopic Distance map

LDA - Terms relevance in each topic

BERTopic - Intertopic Distance map

Models comparison (LSI, LDA, and BERTopic) using coherence measure
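
The comparison above relies on an average coherence value per model. This README does not state which coherence measure was used, so the following is only a minimal sketch that assumes a UMass-style measure, chosen here because it needs nothing beyond document co-occurrence counts. The function names, the toy corpus, and the example topics are all hypothetical; in practice a library implementation such as gensim's `CoherenceModel` would more likely be used.

```python
import math

def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence of one topic, averaged over its word pairs.

    top_words: the topic's top terms, assumed sorted from most to least probable.
    documents: list of tokenized documents (lists of strings).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for later_idx in range(1, len(top_words)):
        for earlier_idx in range(later_idx):
            w_later, w_earlier = top_words[later_idx], top_words[earlier_idx]
            d_earlier = doc_freq(w_earlier)
            if d_earlier == 0:
                continue  # the conditioning word never occurs; skip the pair
            co_occurrences = doc_freq(w_later, w_earlier)
            scores.append(math.log((co_occurrences + eps) / d_earlier))
    return sum(scores) / len(scores) if scores else float("nan")

def average_coherence(topics, documents):
    """Mean coherence over all topics of one model."""
    values = [umass_coherence(topic, documents) for topic in topics]
    return sum(values) / len(values)

# Hypothetical, already-preprocessed toy complaints (not the project's data).
toy_docs = [
    ["delivery", "late", "order"],
    ["delivery", "delay", "order"],
    ["card", "payment", "refund"],
    ["card", "payment", "charge"],
    ["delivery", "late", "refund"],
]
coherent_topic = ["delivery", "late", "order"]  # terms that co-occur
mixed_topic = ["delivery", "card", "charge"]    # terms from unrelated complaints

print(umass_coherence(coherent_topic, toy_docs))
print(umass_coherence(mixed_topic, toy_docs))
print(average_coherence([coherent_topic, mixed_topic], toy_docs))
```

Under this measure, a topic whose top terms tend to appear in the same complaints scores higher (closer to zero) than a topic that mixes terms from unrelated complaints, and averaging the per-topic scores gives one number per model that can be compared across LSI, LDA, and BERTopic.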

Conclusion

In conclusion, this project showed that topic modeling is an effective technique for extracting structured information from large sets of customer complaints. Notably, the BERTopic and LDA models identified the main reasons underlying complaints related to e-commerce in Brazil, in particular problems with credit cards, payments, delivery delays, delivery times, and returns, as well as problems arising from direct service with companies. The LSI model, on the other hand, showed limitations in separating complaints into topics with precise themes. Analysis of the terms in the generated topics showed that many terms, such as "store", "value" and "deadline", overlap across different topics, which suggests the need for further investigation.

These results indicate that, by identifying the most common concerns and problems among customers, companies can better understand the reasons for customer dissatisfaction and take effective measures to improve their products or services, which in turn can lead to better customer relationships and satisfaction. Furthermore, the findings suggest that topic modeling can help companies deal more effectively with customer complaints, mitigating the impact on their market value and reputation and promoting data-driven decision-making to reduce customer attrition and improve the customer experience.

In summary, the contributions of this study are expected to serve as a basis for future research on topic modeling applied to consumer feedback, for practical solutions that companies can use to improve the experience of their consumers, and for future work involving topic modeling in other areas of interest.

The LDA and BERTopic models performed very well in extracting thematic terms from complaints, opening opportunities for new applications. BERTopic also supports uses such as analysis of topic evolution over time and incremental topic modeling, which were not explored in this project and which could further improve the understanding of the problems present in complaints.
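
The term overlap mentioned in the conclusion can be quantified in a simple way. The sketch below is an assumption about how such a check might look, not the analysis performed in the project: it lists terms that appear among the top terms of more than one topic and computes the Jaccard similarity between each pair of topics. The terms "store", "value" and "deadline" come from the text above; the remaining terms, the topic ids, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def shared_terms(topics):
    """Return {term: sorted topic ids} for terms found in two or more topics."""
    seen = defaultdict(set)
    for topic_id, terms in topics.items():
        for term in terms:
            seen[term].add(topic_id)
    return {term: sorted(ids) for term, ids in seen.items() if len(ids) > 1}

def jaccard_overlap(topics):
    """Return {(topic_a, topic_b): Jaccard similarity of their top-term sets}."""
    result = {}
    for (id_a, terms_a), (id_b, terms_b) in combinations(topics.items(), 2):
        set_a, set_b = set(terms_a), set(terms_b)
        union = set_a | set_b
        result[(id_a, id_b)] = len(set_a & set_b) / len(union) if union else 0.0
    return result

# Hypothetical top terms per topic, for illustration only.
toy_topics = {
    0: ["store", "value", "deadline", "delivery"],
    1: ["store", "value", "card", "payment"],
    2: ["deadline", "return", "store", "refund"],
}

for term, topic_ids in shared_terms(toy_topics).items():
    print(term, topic_ids)
for pair, score in jaccard_overlap(toy_topics).items():
    print(pair, round(score, 3))
```

A high Jaccard score between two topics, or a term that shows up in most topics, is a hint that the topics are not well separated and that the term may be a candidate for the stop-word list during preprocessing.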
