Comments (2)
- After you run the process data script, the 17 batch files are the test set. The number of files does not matter.
- Although there is no feedback, I think it is training. To check more quickly whether the code is correct, you can sample a small number of examples (such as 1000; the training set currently has 824342 examples) from the training file and the validation file. You can also try printing something during training: https://github.com/microsoft/CodeBERT/blob/master/CodeBERT/codesearch/run_classifier.py#L98-L105
- You can change the transformers version to 2.5.0 (the version I used) and try again.
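
The sampling suggestion above could be sketched roughly as follows. This is only an illustration, not code from the CodeBERT repository: it assumes the training and validation files hold one example per line, and the helper name `sample_lines` and the file names in the usage comment are hypothetical.

```python
import random


def sample_lines(src_path, dst_path, k=1000, seed=0):
    """Copy k randomly chosen lines from src_path to dst_path.

    Uses reservoir sampling, so the source file is streamed once and
    never held in memory as a whole. Assumes one example per line.
    Returns the number of lines written (less than k for short files).
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    reservoir = []
    with open(src_path, encoding="utf-8") as src:
        for i, line in enumerate(src):
            if not line.endswith("\n"):
                line += "\n"  # normalise a final line without a newline
            if i < k:
                # Fill the reservoir with the first k lines.
                reservoir.append(line)
            else:
                # Keep the new line with probability k / (i + 1).
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = line
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(reservoir)
    return len(reservoir)


# Hypothetical usage (file names are placeholders, adjust to your data directory):
# sample_lines("train.txt", "train_small.txt", k=1000)
# sample_lines("valid.txt", "valid_small.txt", k=1000)
```

Pointing the training script at the small files should make one pass finish quickly, which makes it easier to tell a silent but working run from a hung one.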
from codebert.
Switching to transformers==2.5.0 didn't fix the problem.
I will try my luck with the CodeXGLUE pipeline instead, as it takes less time to train anyway: https://github.com/microsoft/CodeXGLUE/tree/main/Text-Code/NL-code-search-WebQuery
Thank you very much for your help.