Hello, dear annotator! ( ͡° ͜ʖ ͡°)
This repository contains our code and data for the CS140 group project on sentiment analysis.
In `./data/comments_for_annotators/<number>` you will find 5 `jsonl` files, each corresponding to a 2020 presidential candidate. Each file contains `<number>` comments, randomly sampled from the YouTube data set collected earlier in the semester. Note that the possible values for `<number>` are 100, 200, and 500.
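Before annotating, it can help to confirm that a file holds the expected number of comments. The following is a minimal sketch, assuming each non-empty line of a `jsonl` file is one JSON object; the `count_comments` helper and the example path are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def count_comments(path):
    """Count the non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises json.JSONDecodeError on a malformed line
            count += 1
    return count


# Example (hypothetical path; substitute your assigned candidate and <number>):
# count_comments("data/comments_for_annotators/100/biden.100.jsonl")
```

The result should match the `<number>` in the directory name.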
Your mission, should you choose to accept it, is as follows:
- Fork this repository under your own name.
- Set up a Python 3 development environment, e.g. using `conda`.
- Install dependencies by running `pip install -r requirements.txt`.
- Go to `./Server/` and run `python main.py`.
- In your web browser of choice, navigate to `https://localhost:5000`.
- Click `"Browse"`, navigate to `./data/comments_for_annotators/<number>`, and choose the file `<assigned_candidate>.<number>.jsonl`. Afterwards, click `"Upload"`.
- Wait to be redirected and annotate.
- If you need to revise an annotation, edit the comment index in the URL to go back by the desired number of comments.
- After annotating, you will find your annotation file at `./Server/annotations.jsonl`.
- Rename your annotation file by running `mv annotations.jsonl <candidate-last-name>-annotated-<your-name>.jsonl`, where `<candidate-last-name>` should be replaced with a lowercased version of the candidate's last name, e.g. `biden`, and `<your-name>` should be lowercased and hyphenated, e.g. `louis-brandeis`.
- Submit your annotations by committing to your own fork of the repository and opening a pull request against the original repository.
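The naming convention in the rename step can be sketched in Python. This is only an illustration: `annotation_filename` is a hypothetical helper, not part of this repository, and it assumes the parts of your name are separated by whitespace.

```python
def annotation_filename(candidate_last_name, annotator_name):
    """Build '<candidate-last-name>-annotated-<your-name>.jsonl'."""
    candidate = candidate_last_name.strip().lower()
    # Lowercase the annotator's name and join its parts with hyphens.
    annotator = "-".join(annotator_name.lower().split())
    return f"{candidate}-annotated-{annotator}.jsonl"


print(annotation_filename("Biden", "Louis Brandeis"))
# biden-annotated-louis-brandeis.jsonl
```

The printed name can then be used as the target of the `mv` command above.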
Good luck!! (•̀ᴗ•́)൬༉
Here is the schedule of candidate-annotator assignments:
Candidate | Annotator |
---|---|
Biden | Yonglin Wang |
Buttigieg | Xiaoyu Lu |
Sanders | Yonglin Wang |
Warren | Zhuoran Huang |
Yang | Xiaoyu Lu |
Hint: you can double-check the random assignments by running `python scripts/assign_annotators.py`.