
damo-nlp-sg / chain-of-knowledge


[ICLR2024] Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

License: MIT License

Python 98.90% Shell 1.10%

chain-of-knowledge's Introduction

Chain-of-Knowledge

1. Requirements

1.1 OPENAI_API_KEY

Create an account and get the API key for OpenAI (https://openai.com).

OPENAI_API_KEY=YOUR_KEY

1.2 SERPAPI_KEY

Create an account and get the API key for Google retrieval through SerpApi (https://serpapi.com).

SERPAPI_KEY=YOUR_KEY
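
Before running anything that calls the two services, it can help to confirm that both keys are visible to Python. The following is a minimal sketch, assuming the keys are exposed as environment variables named as above; the helper `missing_keys` is hypothetical and not part of this repository.

```python
import os

# Key names taken from sections 1.1 and 1.2.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPAPI_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or empty in env.

    Hypothetical helper: env defaults to the process environment, which
    assumes the keys were exported in the shell rather than loaded some
    other way.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_keys()` with no arguments inspects the current process environment; an empty list means both keys are set.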

1.3 Install requirements

conda env create -f requirements.yaml

1.4 Setup Entity Linking for SPARQL

Entity linking maps text to knowledge graph (KG) facts. For now, it relies on pretrained models.

Download mGENRE entity linking files:

mkdir -p utils/retrieval/linking_data/genre
cd utils/retrieval/linking_data/genre
wget https://dl.fbaipublicfiles.com/GENRE/lang_title2wikidataID-normalized_with_redirect.pkl
wget https://dl.fbaipublicfiles.com/GENRE/titles_lang_all105_marisa_trie_with_redirect.pkl
cd ../..
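
The two downloads are large, so an interrupted `wget` can leave a file missing or empty. As a rough sanity check before preprocessing, the sketch below verifies that both files exist and are non-empty. The helper `missing_genre_files` is hypothetical, and its default path assumes the script is run from the repository root.

```python
from pathlib import Path

# File names taken from the wget commands above.
GENRE_FILES = (
    "lang_title2wikidataID-normalized_with_redirect.pkl",
    "titles_lang_all105_marisa_trie_with_redirect.pkl",
)


def missing_genre_files(genre_dir="utils/retrieval/linking_data/genre"):
    """Return the expected mGENRE files that are absent or empty in genre_dir.

    Hypothetical helper; the default directory assumes the current working
    directory is the repository root.
    """
    base = Path(genre_dir)
    return [
        name
        for name in GENRE_FILES
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]
```

An empty list means both files are in place. This only checks presence and size, not file integrity.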

Preprocess entity information:

python linking.py process_titles

2. Instruction-tuning of adaptive query generator (AQG)

python sft_trainer.py \
    --model_name $BASE_MODEL \
    --dataset_name $DATASET_NAME \
    --load_in_8bit \
    --use_peft \
    --batch_size 32 \
    --gradient_accumulation_steps 2 \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs 3 \
    --push_to_hub True \
    --hub_model_id $HUB_MODEL_ID

3. Inference with chain-of-knowledge (CoK)

python run.py \
    --model gpt-3.5-turbo-0613 \
    --dataset $DATASET_NAME \
    --output $OUTPUT_DIR \
    --step True

Citation

@inproceedings{
    li2024cok,
    title={Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources},
    author={Xingxuan Li and Ruochen Zhao and Yew Ken Chia and Bosheng Ding and Shafiq Joty and Soujanya Poria and Lidong Bing},
    booktitle={International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=cPgh4gWZlz}
}

chain-of-knowledge's People

Contributors

xingxuanli


chain-of-knowledge's Issues

s1_domain

Hello, could you share the code for domain selection? It is not included in the repository.

sparql-contrastive data not in HF

Great job on the project!

I attempted to execute sparql_contrast.sh, but it appears that the dataset veggiebird/sparql-contrastive is missing from HF.

Thank you!

error in “python run.py”

File "/XXX/CoK/utils/hotpotqa_parser.py", line 89, in update_rationales_step_by_step
domains = data_point["s1_domains"]

KeyError: 's1_domains'

Where is data_point["s1_domains"] assigned a value?
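
This does not answer where the field is populated, but a small diagnostic can show which keys a data point actually carries when the lookup fails. The sketch below is an assumption-laden illustration: `get_s1_domains` is a hypothetical helper, not a function from this repository, and only the key name `s1_domains` comes from the traceback above.

```python
def get_s1_domains(data_point):
    """Return data_point["s1_domains"], or raise a KeyError that lists the keys present.

    Hypothetical diagnostic helper; it only makes the failure easier to
    read and does not create the missing field.
    """
    if "s1_domains" not in data_point:
        available = ", ".join(sorted(data_point)) or "<none>"
        raise KeyError(f"'s1_domains' missing; available keys: {available}")
    return data_point["s1_domains"]
```

If the listed keys contain no domain field at all, the domain-selection step probably has not been run on that data point, which would be consistent with the s1_domain issue above.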

Request for Test Set Details and Availability

Hello,

Thank you for the detailed evaluation presented in the paper. I am particularly interested in the performance analysis of ChatGPT and instruction-tuned LLaMA-2-7B on SQL and SPARQL generation as demonstrated in Table 12.

In the paper, it is mentioned that SPARQL was evaluated on 4,779 samples from LC-quad and KQA-pro, and SQL was evaluated on 15,900 samples from WikiSQL. However, I couldn't find the specific details or availability of the test sets used for these evaluations.

Could you please provide more information about the test sets, or share the test sets themselves, if possible? This would greatly aid in reproducing the results and further understanding the performance metrics presented.

Thank you for your assistance.
