Visualize respone rates from text data (blogs, twitter, etc.)
pip install -r requirements.txt
- For datasets with sentences or paragraphs per cell, use
schedulesngrams.py
. For datasets where it's just one word per cell, useschedulesfavorites.py
- Make sure your CSV file's column name starts with 'tweet' if using
schedulesngrams.py
(or 'user' if usingschedulesfavorites.py
) and that there's a column called 'date' with the date formatted as follows:2012-03-24
- Make sure the columns are sorted by descending order (latest dates first).
Originally, I made this for twitter data, but of course it also works for blogs. For example, here's my analysis of the amazing systems scientist George Mobus' blog.
Extremely roughly, you could say 'the steeper the lines, the higher the "interest" '
Plateuing lines means "interest" is decaying.
But this is oversimplifying it.
Disclaimer: There's a list of words in the python script that I have added to exclude (such as stop words, but I've also included words which were not stop words which to me did not seem to convey much information in most cases as I was using this script. If you'd like to keep all words, I suggest looking up "toremove" and replacing those giant lists of words with an empty array.