This project forked from tbabm/nngen

NNGen, a simple baseline for commit message generation from diffs.

License: MIT License

Dockerfile 3.82% Python 34.09% Shell 2.66% Perl 59.43%

NNGen

NNGen is a simple baseline for commit message generation from diffs, proposed in [1].

Structure

  • data: our cleaned dataset
  • archive: the test results of NNGen and NMT [2] on the cleaned dataset
  • scripts:
    • multi-bleu.perl: used for calculating BLEU score
  • nngen.py: implementation of NNGen

Run

  • Use python3 and install the dependencies:
pip3 install fire numpy scipy nltk scikit-learn

# arguments: train_diff_file, train_msg_file, test_diff_file
python3 -m nngen main \
    ./data/cleaned.train.diff \
    ./data/cleaned.train.msg \
    ./data/cleaned.test.diff

# evaluate
./scripts/multi-bleu.perl ./data/cleaned.test.msg < ./nngen.cleaned.test.msg

Note

  • Our implementation assumes that:
    • each data file contains multiple lines and ends with an empty line
    • each line contains exactly one diff or one commit message
  • Because the dependencies (nltk and scikit-learn) have been updated, the performance of NNGen on the cleaned dataset may differ slightly from the results reported in our paper [1], for example 16.41 vs 16.42 in terms of BLEU-4 score. To reproduce our results exactly, see the Reproduce section.
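
The file-format assumptions above can be checked before running NNGen. The following is a minimal sketch, not part of nngen.py: the helper names `read_records` and `check_pair` are hypothetical, and it interprets "ends with an empty line" as "the last record is followed by a newline", which is an assumption.

```python
from pathlib import Path


def read_records(path):
    """Return the records of a data file, one per line.

    Assumption: each line holds one diff or one commit message, and the
    file ends with a trailing newline after the last record.
    """
    text = Path(path).read_text(encoding="utf-8")
    if text and not text.endswith("\n"):
        raise ValueError(f"{path}: missing trailing newline")
    # Splitting on "\n" leaves one empty string after the final newline;
    # drop it so that only real records remain.
    return text.split("\n")[:-1]


def check_pair(diff_path, msg_path):
    """Check that a diff file and a message file have the same number of lines.

    Returns the number of (diff, message) pairs, or raises ValueError.
    """
    diffs = read_records(diff_path)
    msgs = read_records(msg_path)
    if len(diffs) != len(msgs):
        raise ValueError(
            f"{diff_path} has {len(diffs)} lines, "
            f"but {msg_path} has {len(msgs)} lines"
        )
    return len(diffs)
```

For example, `check_pair("./data/cleaned.train.diff", "./data/cleaned.train.msg")` would return the number of training pairs if the two files line up, and raise a `ValueError` otherwise.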

Reproduce

To reproduce the results presented in [1] exactly, install the requirements at the pinned versions.

pip3 install -r requirements.txt

Alternatively, use our prebuilt Docker image, although it may take longer to produce the results.

# install docker

docker pull tbabm/nngen:0.1

docker run -it --rm -v $(pwd):/root/nngen --name run-nngen tbabm/nngen:0.1 \
       python3 -m nngen main \
               ./data/cleaned.train.diff \
               ./data/cleaned.train.msg \
               ./data/cleaned.test.diff

Enjoy!

Reference

[1] Liu Z, Xia X, Hassan A E, et al. Neural-machine-translation-based commit message generation: how far are we?[C]//Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 2018: 373-384.

[2] Jiang S, Armaly A, McMillan C. Automatically generating commit messages from diffs using neural machine translation[C]//Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering. IEEE Press, 2017: 135-146.
