ysong10 / ticket-tagger

This project forked from rafaelkallis/ticket-tagger

Machine learning driven issue classification bot.

Home Page: https://github.com/apps/ticket-tagger

License: MIT License

JavaScript 100.00%

ticket-tagger's Introduction

Ticket Tagger

Machine learning driven issue classification bot. Add to your repository now!

Development

notice:

  • nodejs ^8.3.x is required to compile/install dependencies
  • wget is required for fetching datasets

get started:

git clone https://github.com/rafaelkallis/ticket-tagger ticket-tagger
cd ticket-tagger

# install appropriate nodejs version
nvm install 8
nvm use 8

# compile/install dependencies
npm install

# fetch dataset
npm run dataset

# run benchmark
npm run benchmark

# run linter
npm run lint

# run tests
npm test

# run server
npm start

experiments:

For each experiment, we need a dataset that allows us to test the stated hypothesis, as well as a baseline dataset that contains the same number of labelled issues.
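A minimal sketch of how a size-matched baseline could be drawn from a general pool of issues. The function names and the `{ label, text }` row shape are hypothetical, and matching the counts per label (rather than only the total) is an assumption, not something the project documents. The pool is assumed to be shuffled beforehand.

```javascript
// Count how many rows carry each label.
function countLabels(rows) {
  const counts = {};
  for (const row of rows) {
    counts[row.label] = (counts[row.label] || 0) + 1;
  }
  return counts;
}

// Walk the pool once and keep rows until every label quota
// taken from the experiment dataset is filled.
function buildBaseline(pool, experiment) {
  const remaining = countLabels(experiment);
  const baseline = [];
  for (const row of pool) {
    if (remaining[row.label] > 0) {
      baseline.push(row);
      remaining[row.label] -= 1;
    }
  }
  return baseline;
}

// Usage with toy data.
const experiment = [
  { label: 'bug', text: 'a' },
  { label: 'bug', text: 'b' },
  { label: 'question', text: 'c' },
];
const pool = [
  { label: 'question', text: 'p1' },
  { label: 'bug', text: 'p2' },
  { label: 'enhancement', text: 'p3' },
  { label: 'bug', text: 'p4' },
  { label: 'bug', text: 'p5' },
  { label: 'question', text: 'p6' },
];
const baseline = buildBaseline(pool, experiment);
console.log(countLabels(baseline));
```

If the pool holds fewer rows of some label than the experiment dataset does, the baseline comes out smaller, so it is worth comparing the two counts before running a benchmark.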

Does a repository-specific dataset affect the model's performance?

# run baseline-issues benchmark
npm run dataset:vscode:baseline
npm run benchmark

# run vscode-issues benchmark
npm run dataset:vscode
npm run benchmark

Does a (spoken) language-specific dataset affect the model's performance?

# run baseline-issues benchmark
npm run dataset:english:baseline
npm run benchmark

# run english-issues benchmark
npm run dataset:english
npm run benchmark

Do code snippets affect the model's performance?

# run baseline-issues benchmark
npm run dataset:nosnip:baseline
npm run benchmark

# run nosnip-issues benchmark
npm run dataset:nosnip
npm run benchmark
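As an illustration of what a snippet-free ("nosnip") issue could look like, the sketch below removes Markdown code from an issue body. How the project actually builds its nosnip dataset is not documented here, so `stripSnippets` and its regular expressions are assumptions.

```javascript
// Three backticks, built at runtime so this example can sit inside a fence.
const FENCE = '`'.repeat(3);

// Hypothetical helper: drop fenced blocks and inline code, then tidy whitespace.
function stripSnippets(body) {
  const fenced = new RegExp(FENCE + '[\\s\\S]*?' + FENCE, 'g');
  return body
    .replace(fenced, ' ')        // fenced code blocks
    .replace(/`[^`\n]*`/g, ' ')  // inline code spans
    .replace(/\s+/g, ' ')        // collapse leftover whitespace
    .trim();
}

// Usage with a toy issue body.
const issueBody =
  'It fails. ' + FENCE + 'js\nfoo();\n' + FENCE + ' Call `bar()` please.';
const cleaned = stripSnippets(issueBody);
console.log(cleaned);
```

Indented code blocks and HTML `<code>` tags are not handled; a Markdown parser would be more robust than regular expressions.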

generate dataset:

A dataset (with 10k bugs, 10k enhancements and 10k questions) can be downloaded using npm run dataset. The dataset was generated from GitHub Archive, which can be accessed through Google BigQuery.

Add the query below to your BigQuery console and adjust if needed (e.g., add __label__ prefix to labels, etc.).

SELECT
  label, CONCAT(title, ' ', REGEXP_REPLACE(body, '(\r|\n|\r\n)',' '))
FROM (
  SELECT
    LOWER(JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name')) AS label,
    JSON_EXTRACT_SCALAR(payload, '$.issue.title') AS title,
    JSON_EXTRACT_SCALAR(payload, '$.issue.body') AS body
  FROM
    [githubarchive:day.20180201],
    [githubarchive:day.20180202],
    [githubarchive:day.20180203],
    [githubarchive:day.20180204],
    [githubarchive:day.20180205]
  WHERE
    type = 'IssuesEvent'
    AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'closed' )
WHERE 
  (label = 'bug' OR label = 'enhancement' OR label = 'question')
  AND body != 'null';
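The query returns a label and a single-line text per issue. The sketch below shows how such a row could be turned into a training line with the `__label__` prefix mentioned above, which is the convention used by fastText-style classifiers. The function name is hypothetical, and whether the project expects exactly this layout is an assumption.

```javascript
// Hypothetical helper mirroring the query: lower-case the label,
// join title and body, and replace line breaks with spaces.
function toTrainingLine(label, title, body) {
  const text = title + ' ' + body.replace(/\r|\n/g, ' ');
  return '__label__' + label.toLowerCase() + ' ' + text;
}

// Usage with a toy issue.
const line = toTrainingLine('Bug', 'Crash on start', 'Steps:\n1. open');
console.log(line);
// prints "__label__bug Crash on start Steps: 1. open"
```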

run serverless app:

You need a .env file in order to run the GitHub app. The file should look like this:

GITHUB_CERT=/path/to/cert.private-key.pem
GITHUB_SECRET=123456
GITHUB_APP_ID=123
PORT=3000

Note: When running the app in production, environment variables should be provided by the host.
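Since the variables may come from a .env file or from the host, it can help to validate them once at startup. This is a minimal sketch, not the project's own code: `loadConfig`, the returned field names, and the default port of 3000 are assumptions. Only the four variable names come from the example above.

```javascript
// Hypothetical helper: checks the variables from the .env example
// and returns them in a typed config object.
function loadConfig(env) {
  const required = ['GITHUB_CERT', 'GITHUB_SECRET', 'GITHUB_APP_ID'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  // Assumption: fall back to port 3000 when PORT is not set.
  const port = Number.parseInt(env.PORT || '3000', 10);
  if (Number.isNaN(port)) {
    throw new Error('PORT must be an integer, got: ' + env.PORT);
  }
  return {
    certPath: env.GITHUB_CERT,
    secret: env.GITHUB_SECRET,
    appId: Number.parseInt(env.GITHUB_APP_ID, 10),
    port: port,
  };
}

// Usage: in a running app this would be loadConfig(process.env).
const config = loadConfig({
  GITHUB_CERT: '/path/to/cert.private-key.pem',
  GITHUB_SECRET: '123456',
  GITHUB_APP_ID: '123',
  PORT: '3000',
});
console.log(config.port);
```

Failing fast with a list of the missing names tends to be easier to debug than an authentication error later on.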
