squeezeailab / llm2llm

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Home Page: https://arxiv.org/abs/2403.15042

License: MIT License


llm2llm's Introduction

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [Paper]

Thumbnail

This is the code for the LLM2LLM paper.

Reproducing Main Experiments

We have provided code required to reproduce our main experiments for GSM8K. Instructions for other datasets will be uploaded soon.

  1. Download a copy of LLaMA-2-7B and the appropriate dataset.
  2. Clone the GSM8K dataset by running
cd GSM8K
git clone https://github.com/openai/grade-school-math.git
  3. Run generate_seed_data.py and adjust SUBSAMPLE_SPLIT to get seed data.
  4. Ensure that all settings in config.yaml are accurate.
  5. Run python GSM8K/generator_data.py GSM8K/config.yaml.
  6. cd into your experiment folder and run ./run_all.sh.
  7. After all of the iterations have finished, run
python report_results.py --results_file_name test_0.jsonl GSM8K/grade-school-math/grade_school_math/data/test.jsonl $EXP_FOLDER

to get a detailed breakdown of the performance of the model at each iteration.
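As a rough illustration of what such a per-iteration breakdown involves, the snippet below scores prediction files against GSM8K references. This is a minimal sketch, not the actual report_results.py implementation: the directory layout (one subfolder per iteration containing test_0.jsonl), the record fields `answer` and `prediction`, and the use of GSM8K's `#### <answer>` convention are all assumptions.

```python
import json
from pathlib import Path


def final_answer(text: str) -> str:
    """Extract the final answer after GSM8K's '#### ' marker."""
    return text.split("####")[-1].strip()


def accuracy_per_iteration(exp_folder: str) -> dict:
    """Score each iteration's test_0.jsonl predictions against references.

    Assumes (hypothetically) one subfolder per iteration, each holding a
    test_0.jsonl file whose records carry 'answer' and 'prediction' fields.
    """
    scores = {}
    for results_file in sorted(Path(exp_folder).glob("*/test_0.jsonl")):
        records = [
            json.loads(line)
            for line in results_file.read_text().splitlines()
            if line.strip()
        ]
        correct = sum(
            final_answer(r["prediction"]) == final_answer(r["answer"])
            for r in records
        )
        scores[results_file.parent.name] = correct / max(len(records), 1)
    return scores
```

The division guard (`max(len(records), 1)`) simply avoids a crash on an empty results file.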

This will produce an output folder that contains all the data and model checkpoints.
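The seed-data step above can be sketched as a simple random subsample of the training jsonl. This is only an illustration of the idea, assuming GSM8K's one-JSON-object-per-line format; the actual split size and output path come from SUBSAMPLE_SPLIT in generate_seed_data.py, and the function name here is hypothetical.

```python
import json
import random
from pathlib import Path


def subsample_seed_data(train_path: str, out_path: str,
                        n_seed: int, seed: int = 0) -> int:
    """Randomly pick n_seed training examples to serve as the seed dataset.

    Hypothetical stand-in for generate_seed_data.py's SUBSAMPLE_SPLIT logic.
    """
    examples = [
        json.loads(line)
        for line in Path(train_path).read_text().splitlines()
        if line.strip()
    ]
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    subset = rng.sample(examples, min(n_seed, len(examples)))
    Path(out_path).write_text("\n".join(json.dumps(ex) for ex in subset))
    return len(subset)
```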

Roadmap

We are planning on adding the code required to reproduce our experiments on other datasets.

Citation

LLM2LLM has been developed as part of the following paper. We would appreciate it if you cite this paper when you find this library useful for your work:

@article{lee2024llm2llm,
      title={LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement}, 
      author={Lee, Nicholas and Wattanawong, Thanakul and Kim, Sehoon and Mangalam, Karttikeya and Shen, Sheng and Anumanchipali, Gopala and Mahoney, Michael W and Keutzer, Kurt and Gholami, Amir},
      journal={arXiv},
      year={2024},
}


llm2llm's Issues

where is run_all.sh

Thank you for open-sourcing this work. I'd like to try it on my own dataset, but I cannot find the complete running pipeline run_all.sh. Is it missing?

Insightful Connection to My Previous Paper

I recently read your paper and it is a great paper. Your research provides valuable insights into LLM-based data augmentation.

As I was reading your paper, I couldn't help but notice the parallels between your findings and the work AI2 and I published last year at EMNLP, titled "Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation." Our paper delves into targeted data augmentation for MWP solving, which might complement and extend the discussions in your paper.

Therefore, I was wondering if you might consider acknowledging our work in your paper, as it could provide additional depth to the understanding and implications of your findings for the readers. I would be more than happy to discuss this further or provide any additional information you might need regarding my work.

Other Tasks

Only the code for GSM8K is provided. Will the code for other datasets be provided?
If so, what is the approximate timeline? Thanks!
