phymucs / teddy

This project forked from megagonlabs/teddy

A system for interactive review analysis.

License: Apache License 2.0

Makefile 1.24% Python 82.90% HTML 2.94% CSS 12.11% JavaScript 0.40% Shell 0.41%

Teddy, the Review Explorer

This page contains the source code and supplementary material for our CHI 2020 paper: "Teddy: A System for Interactive Review Analysis".

  1. Introduction
  2. Motivation: An Interview Study into Review Analysis Practices and Challenges
  3. How to use the data and source code in this repo?
  4. The Dataset
  5. Citing Teddy
  6. Contact

Introduction

Teddy (Text Exploration for Diving into Data deeplY) is an interactive system that enables data scientists to quickly obtain insights from reviews and improve their extraction and modeling pipelines. Please watch the demo video for an overview of the system and our contributions.

You can also try Teddy online here!

Above: The Teddy user interface. From left to right we have the Entity View displaying the entities mentioned in reviews, the Cluster View for exploring aggregate statistics over hierarchical clusters of reviews, the Detail View for viewing and filtering/sorting individual reviews, and the Schema Generation View for recording aspects of interest from the reviews.



Above: The Teddy review exploration pipeline. Users can customize the data processing pipeline based on their task, whether it is classification, opinion extraction, or representation learning, and use Teddy to gain insights about their data and model. They can also use the application to iterate on the data processing pipeline, for example by creating a new schema that describes attributes of their review corpus.

Motivation: An Interview Study into Review Analysis Practices and Challenges

We conducted an interview study with fourteen participants to better understand the workflows and rate-limiting tasks of data scientists working on reviews, which motivated the development of features in Teddy. We used an iterative coding method to aggregate the collected data.

Download the results of our iterative coding here.

Anonymized notes from individual interviews and our interview question template are also available in the results/ folder.

How to use the data and source code in this repo?

Important Folders

  • app/ server and front-end code
  • data/ subdirectories containing Trip Advisor data or your own datasets
  • libs/ python libraries for data processing
  • tests/ testing code for the code in libs/

Local Installation

Teddy requires Python 3.5 or above. Make sure you have venv installed. If you don't, run:

python3 -m pip install virtualenv

Then copy the contents of /app/react-app/.env.example to /app/react-app/.env.

# Install dependencies
make install ENV=local
# Build dependencies
make build
# These will automatically run in a virtual environment called 'venv'
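
The manual prerequisites for installation (the Python version and the .env file) can also be checked from a short script before running make. The following is a minimal sketch and not part of the repository: the function names python_is_supported and prepare_env are hypothetical, and the sketch assumes it is called with the path to the repository root so that app/react-app/ resolves correctly.

```python
import shutil
import sys
from pathlib import Path


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter meets the stated minimum (Python 3.5)."""
    return tuple(version_info[:2]) >= (3, 5)


def prepare_env(repo_root="."):
    """Copy app/react-app/.env.example to app/react-app/.env.

    Returns True if a new .env was created and False if one already
    existed, in which case the existing file is left untouched.
    """
    react_app = Path(repo_root) / "app" / "react-app"
    example = react_app / ".env.example"
    target = react_app / ".env"
    if target.exists():
        return False  # keep any local edits
    # str() keeps the call compatible with Python 3.5, which predates
    # path-like object support in shutil.
    shutil.copyfile(str(example), str(target))
    return True
```

Calling python_is_supported() and then prepare_env() from the repository root covers the two manual steps; the make targets themselves are unchanged.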

API Keys (Optional)

Teddy uses Google API keys to render the map and the hotel images. Please refer to the Google Maps Platform documentation on how to get an API key, and enable both the Maps JavaScript API and the Places API for that key.

Running the Application

# start the backend server
make server
# start the user interface
make ui

Then navigate to http://localhost:3000 in your browser.
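
The user interface can take a moment to come up after make ui. As a rough sketch (not part of the repository), the helper below polls the address given above until it answers. The function name wait_for_ui and the default timeout and interval values are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_ui(url="http://localhost:3000", timeout=60.0, interval=1.0):
    """Poll url until it answers an HTTP request or timeout seconds elapse.

    Returns True as soon as the server responds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even though the status was an error.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet; wait and try again.
            time.sleep(interval)
    return False
```

For example, wait_for_ui(timeout=120) returns True once the page is reachable, after which it is safe to open the browser.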

The Dataset

The reviews we include to demonstrate the application are provided by Trip Advisor under the Creative Commons Attribution Non-Commercial 4.0 International License.

(Barkha Bansal. (2018). TripAdvisor Hotel Review Dataset. Zenodo. http://doi.org/10.5281/zenodo.1219899).

A subset of the reviews for San Francisco hotels has been selected and modified by (1) computing extractions of (aspect, opinion) pairs and (2) clustering the reviews and computing statistics over those clusters.
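
To make the two processing steps concrete, here is a small illustrative sketch of aggregating extracted pairs per cluster. It does not reflect the actual file format or code in data/ or libs/: the records, the field names cluster and pairs, and the function cluster_statistics are all invented for illustration.

```python
from collections import Counter, defaultdict

# Made-up records, not taken from the dataset. Each review carries a
# cluster id and its extracted (aspect, opinion) pairs.
reviews = [
    {"cluster": 0, "pairs": [("room", "clean"), ("staff", "friendly")]},
    {"cluster": 0, "pairs": [("room", "small")]},
    {"cluster": 1, "pairs": [("location", "convenient"), ("room", "clean")]},
]


def cluster_statistics(reviews):
    """Return, per cluster, the review count and a Counter of aspects mentioned."""
    stats = defaultdict(lambda: {"n_reviews": 0, "aspects": Counter()})
    for review in reviews:
        entry = stats[review["cluster"]]
        entry["n_reviews"] += 1
        entry["aspects"].update(aspect for aspect, _opinion in review["pairs"])
    return dict(stats)


stats = cluster_statistics(reviews)
print(stats[0]["n_reviews"], stats[0]["aspects"].most_common(1))
# prints: 2 [('room', 2)]
```

Counting aspects per cluster is only one possible statistic; the same loop could just as well tally opinions or pair frequencies.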

Some of the icons used in our application are made by Freepik and can be found at www.flaticon.com.

Citing Teddy

Please cite the CHI paper.

@inproceedings{zhang2020teddy,
 title = {Teddy: A System for Interactive Review Analysis},
 author = {
   Xiong Zhang AND
   Jonathan Engel AND
   Sara Evensen AND
   Yuliang Li AND
   {\c{C}}a{\u{g}}atay Demiralp AND 
   Wang-Chiew Tan
   }, 
 booktitle = {ACM Human Factors in Computing Systems (CHI)},
 year = {2020}
}

Contact

To get help with problems using Teddy or replicating our results, please submit a GitHub issue.

For personal communication related to Teddy, please contact Jonathan Engel ([email protected]), Sara Evensen ([email protected]), or Çağatay Demiralp ([email protected]).
