harrychangjr / sp1541-nlp


NLP Project using past essays from SP1541: Exploring Science Communication through Popular Science

Home Page: https://sp1541-nlp.streamlit.app/

Python 0.95% Jupyter Notebook 99.05%
nltk-python python streamlit textstat

sp1541-nlp

NLP Project using past essays from SP1541: Exploring Science Communication through Popular Science in Academic Year 2020/21 Semester 1

For context, in Academic Year 2020/21 Semester 1 I took the module SP1541: Exploring Science Communication through Popular Science, for which I had to submit 2 news articles for grading.

The first article, titled "Timing vaccination campaign to reduce measles infections", is related to my academic discipline and revolves mainly around mathematics.

The second article, titled "Investigating the relationship between culture and sweet-sour taste interactions", is not related to my academic discipline and is based on chemistry.

Unfortunately, I scored below average for both articles. I presume that, as a freshman back then, I had not yet had enough training to communicate complex scientific concepts well to a lay audience.

With the introduction of ChatGPT, however, I took the opportunity to see whether this AI tool could optimise my initial write-ups. The following texts are used for this analysis:

| Text_id  | Description                            |
|----------|----------------------------------------|
| 1a       | News Article 1 - Original              |
| 1b       | News Article 1 - Optimised (Min)       |
| 1c       | News Article 1 - Optimised (Max)       |
| 2a       | News Article 2 - Original              |
| 2b       | News Article 2 - Optimised (Min)       |
| 2c       | News Article 2 - Optimised (Max)       |

For submission, the word limits of the 2 articles were 800 and 1,000 words respectively. For each article, 2 other variants were produced, namely:

  • "b" series - using ChatGPT to summarise the original article with as few words as possible (~400 words)
  • "c" series - using ChatGPT to stick to the original word limit(s), while enhancing the language and expression of the article text where applicable

Using Python libraries including matplotlib, seaborn, nltk, textstat and wordcloud, I compare the variants in detail to evaluate whether ChatGPT enhanced or reduced the quality of the original articles.

Three main methods will be used for this analysis:

  1. Preliminary analysis - comparing word counts, readability scores and sentiment (compound) scores
  2. Creating word clouds to identify the most frequently used words in each article
  3. Identifying the top 10 words within each article series

Summary of results

Preliminary analysis

Using ChatGPT in an attempt to optimise the original articles resulted in:

  • A decrease in Flesch reading scores (i.e. lower readability)
  • A slight increase in, or no change to, sentiment compound scores (positive tone)
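
To make the readability result easier to interpret, here is a minimal sketch of the standard Flesch reading ease formula, where a higher score means easier text. It is not the code used in this project, which relies on textstat. The syllable counter (`count_syllables`) is a crude assumption that counts vowel groups, and the two sample sentences are hypothetical, so scores will not match textstat's output exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


# Hypothetical sample sentences, not taken from the essays.
simple = "The cat sat on the mat. It was a warm day."
dense = "Epidemiological modelling facilitates optimisation of immunisation scheduling."

print(round(flesch_reading_ease(simple), 1))
print(round(flesch_reading_ease(dense), 1))
```

Longer sentences and longer words both pull the score down, which is one plausible reason a rewrite with more elaborate vocabulary could score lower than the original.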

Top words used among each article series

  • Variants of News Article 1: measles, vaccination, campaign, Pakistan, cases, infections, health, November, disease, children
  • Variants of News Article 2: taste, sweetness, sourness, sucrose, sensitivities, study, consumers, Danish, acid, Chinese
  • Across variants from both articles: study, researchers, may, one, could, results, 2019
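
The top-words step can be sketched roughly as follows. This is an illustration using only the standard library, not the notebook's actual code. The stop-word set, the `top_words` helper and the sample texts are all assumptions made for the example. In practice a fuller stop-word list, such as the English list shipped with nltk, would be more appropriate.

```python
import re
from collections import Counter

# Small illustrative stop-word set (assumption); a real list would be much longer.
STOP_WORDS = {
    "the", "a", "an", "of", "to", "in", "and", "is", "are",
    "for", "on", "that", "this", "with", "as", "by", "can",
}


def top_words(texts, n=10):
    """Return the n most common non-stop-word tokens across a series of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphanumeric tokens so that years such as "2019" survive.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical stand-ins for texts 1a, 1b and 1c; not the real essays.
series_1 = [
    "Measles cases rose in Pakistan.",
    "A vaccination campaign can reduce measles infections.",
    "Timing the measles vaccination campaign matters.",
]

print(top_words(series_1, n=3))
```

Passing all six texts at once instead of one series would give the words shared across both articles.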

References

News Article 1

News Article 2
