
dl_code_completion_paper's People

Contributors

agfeather, masuhar


dl_code_completion_paper's Issues

typo

  • a. we hereafter call predictions -> we hereafter call them predictions
  • b. 4.3 NT2V -> TT2V

Differences from the POPL'19 paper (Reviewer B)

The following recent paper proposes a
similar technique for representing ASTs.
Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. 2019.
code2vec: learning distributed representations of code. Proc. ACM
Program. Lang. 3, POPL, Article 40 (January 2019), 29 pages. DOI:
https://doi.org/10.1145/3290353
Please add a discussion comparing ASTToken2Vec with this study:
differences, potential issues, and so on.

Improve the positioning and explanation of Figure 1

d. I do not think that Figure 1 shows the ASTToken2Vec–LSTM
integration model. Is the current token always given from the code
repository? Is a test set always needed in real usage? This figure
might be showing both the model and the experimental design.
I recommend revising it.

Add an explanation of Figure 4

c. An explanation of Figure 4 must be added. Which module is the
input layer and which is the output layer?

Discussion of token similarity; AST structural information appears in existing work

(Moreover, the use case shown in Figure 8 hinted at the
characteristics of the two fields “pageX” and “pageY”.
However, I cannot figure out the reasoning behind the idea of
treating these token types distinctly.)

  • Why do the authors consider that incorporating information on AST token types into the construction of the LSTM models helps predict the next AST token?
  • How does token similarity in the code-completion context differ between conventional LSTM models and the proposed one?
  • Is it quite hard to discuss this for deep-learning-based code completion?
  • Additionally, the paper should present the authors' motivation, together with a complementary example, in an early section (before describing the details of the models).
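One way to answer the question of why the two token kinds are treated distinctly is with a toy sketch. The snippet below is an illustration under our own naming (the vocabularies, table names, and `embed` helper are hypothetical, not the paper's implementation): terminal and non-terminal AST tokens live in separate vocabularies, so each kind gets its own embedding table before any sequence model sees them.

```python
# Illustrative sketch only: separate embedding tables for terminal and
# non-terminal AST tokens. Random vectors stand in for learned embeddings.
import random

random.seed(0)
EMBED_DIM = 4

def make_embedding(vocab):
    # One vector per token; in a real model this table would be learned.
    return {tok: [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
            for tok in vocab}

# Hypothetical vocabularies: AST node kinds vs. identifier values.
non_terminals = ["Program", "CallExpression", "MemberExpression"]
terminals = ["pageX", "pageY", "console"]

nt_table = make_embedding(non_terminals)  # NT2V-style table
t_table = make_embedding(terminals)       # TT2V-style table

def embed(token, kind):
    # Pick the table that matches the token's AST type, so the same
    # spelling could receive different vectors in different roles.
    table = nt_table if kind == "non-terminal" else t_table
    return table[token]

vec = embed("pageX", "terminal")
print(len(vec))  # prints 4
```

Because the vocabularies are disjoint, similarity is only ever computed among tokens of the same kind, which may be what lets values such as “pageX” and “pageY” end up close to each other.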

Inquiry from the editor (1st round)

From: Daisaku Yokoyama [email protected]
Subject: Inquiry regarding manuscript 19-CONF-05 (1st round)
To: [email protected]
Cc: YOKOYAMADAISAKU [email protected]
Date: Tue, 12 Nov 2019 22:33:27 +0900

Dear Dongfang Li and Hidehiko Masuhara,

We hope this message finds you well.

Regarding the paper you submitted to the JSSST journal Computer
Software,

Paper number: 19-CONF-05
Authors: Dongfang Li, Hidehiko Masuhara
Title: ASTToken2Vec: an Embedding Method for Neural Code Completion

the reviewers have returned an interim report with a decision of
“inquiry”. Please send your responses to the attached inquiry items
to the handling editor below by

three months from now (February 11, 2020).

If you revise the manuscript, please send the latest revised version
(clearly indicating the changed parts in your response letter); if
you do not revise it, please send the originally submitted
manuscript. Please send the response letter in plain text or PDF
format, and the manuscript in PDF format.

Sincerely,

November 12, 2019

Editorial Committee, Japan Society for Software Science and Technology
Daisaku Yokoyama
Department of Computer Science, School of Science and Technology, Meiji University
[email protected]

========================================

Reviewer A

First-round review result
Decision: Inquiry

Summary:
This paper proposes ASTToken2Vec, which exploits information on the
types of AST tokens (terminal and non-terminal) to improve the
success rate of LSTM-based code completion. The experimental results
with 150,000 JavaScript files show that the implementation with the
ASTToken2Vec embedding is superior to the one without it.

Evaluation:
I found this paper interesting. It has a scientific contribution
that demonstrates the improvement of existing LSTM-based code
completion. The idea of introducing AST token types is feasible.

With respect to the following two points, solid answers, and
revisions based on those answers, are required for acceptance of
the paper.

  1. Although the paper devotes much space to describing the proposed
    ASTToken2Vec model, it is not easy for me (and many readers) to
    understand. I know that an AST consists of terminal and
    non-terminal tokens. Moreover, the use case shown in Figure 8
    hinted at the characteristics of the two fields “pageX” and
    “pageY”. However, I cannot figure out the reasoning behind the (→ #6)
    idea of treating these types distinctly. Why do the authors
    consider that incorporating information on AST token types into
    the construction of the LSTM models helps predict the next AST
    token? How does token similarity in the code-completion context
    differ between conventional LSTM models and the proposed one? Is
    it quite hard to discuss this for deep-learning-based code
    completion? At least, I think that the paper should provide
    further discussion of the experimental results. Additionally, the
    paper should present the authors' motivation, together with a
    complementary example, in an early section (before describing the
    details of the models).

  2. I wonder whether the experimental results truly show an (→ #7)
    improvement in prediction accuracy. Surely, the accuracy of the
    proposed method is 1.5 or 3.1 percentage points better, but does
    this claim always hold? Do the results depend on the values of
    the several parameters that the authors chose? The “Threats to
    Validity” section must provide enough information on the factors
    that affect the experimental results.

Minor comments:
a. we hereafter call predictions -> we hereafter call them predictions (#8)
b. 4.3 NT2V -> TT2V (#8)
c. An explanation of Figure 4 must be added. Which module is the
input layer and which is the output layer? (#9)
d. I do not think that Figure 1 shows the ASTToken2Vec–LSTM (#10)
integration model. Is the current token always given from the code
repository? Is a test set always needed in real usage? This figure
might be showing both the model and the experimental design.
I recommend revising it.
e. The Japanese title might not reflect the contents of the paper. (#11)

========================================

Reviewer B

First-round review result
Decision: Accept (with conditional-acceptance comments)

This paper proposes ASTToken2Vec, a technique for neural-network-based
code completion with a vector representation of program tokens in
abstract syntax trees. An experimental comparison with 150,000
JavaScript program files shows that ASTToken2Vec outperforms the
baseline of an LSTM model.

Strengths:

  • The proposal is technically reasonable and sound.
  • The experiment is conducted with a publicly available dataset.
  • The proposal is applied to large-scale data.
  • The paper is easy to read.

Weaknesses:

  • The performance improvement is not very big.
  • From the current evaluation, it is not clear how each structure, (#12)
    such as NT2V and TT2V, contributes to the improvement of the
    final performance.

Considering the above strengths, I consider that the paper offers
enough contributions to the community in terms of technical
originality, effectiveness, and readability.

《Comments as conditions for acceptance》
I have one comment for revision. The following recent paper proposes a (#13)
similar technique for representing ASTs.
Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. 2019.
code2vec: learning distributed representations of code. Proc. ACM
Program. Lang. 3, POPL, Article 40 (January 2019), 29 pages. DOI:
https://doi.org/10.1145/3290353
Please add a discussion comparing ASTToken2Vec with this study:
differences, potential issues, and so on.

《Other suggestions for improving the paper》
Minor: Figures 6 and 7 are too small and difficult to read. Please (#14)
enlarge these figures.

===============

Reliability of the experimental results (Reviewer A, point 2)

I wonder whether the experimental results truly show an improvement
in prediction accuracy. Surely, the accuracy of the proposed method
is 1.5 or 3.1 percentage points better, but does this claim always
hold? Do the results depend on the values of the several parameters
that the authors chose? The “Threats to Validity” section must
provide enough information on the factors that affect the
experimental results.

Comments by Paul

  • A roadmap for the paper would be welcome. For me, it was not easy to tell where the proposal of this paper starts.
  • A section without introductory text (e.g., “Experiments [NO TEXT] 6.1 Dataset”) confuses me (and I lose motivation) because I cannot tell why I am reading it.
  • Figure 3: some labels are vertical and others horizontal; it looks strange.
  • Grayscale for all figures would be welcome for “poor reviewers without a color printer”.
  • I would recommend changing the title “2 Background” to something more specific.
  • The title footnote (I guess) contains Japanese text.
  • “Tung et al. [9] extends” -> without the “s”?
  • Use top ([t]) placement for figures?
  • Should Section 3 be a subsection of Section 2?

change style file for journal submission

Currently the manuscript is formatted for the conference
proceedings; it should be reformatted in the journal style. This can
be done by just changing a class option.
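For illustration only, the switch would look something like the following in the preamble. The class name `somestyle` and the option names `conference`/`journal` are hypothetical stand-ins; use whatever the journal's actual style file defines.

```latex
% Hypothetical sketch: same document class, with the layout option
% changed from the conference-proceedings format to the journal format.
% \documentclass[conference]{somestyle}   % before (proceedings)
\documentclass[journal]{somestyle}        % after (journal)
```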
