damo-nlp-sg / chain-of-knowledge

[ICLR2024] Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

License: MIT License

Chain-of-Knowledge

1. Requirements

1.1 OPENAI_API_KEY

Create an account and get the API key for OpenAI (https://openai.com).

OPENAI_API_KEY=YOUR_KEY

1.2 SERPAPI_KEY

Create an account and get the API key for Google retrieval via SerpApi (https://serpapi.com).

SERPAPI_KEY=YOUR_KEY
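
The two keys above are plain `NAME=value` assignments. As a minimal sketch, assuming the code reads them from environment variables (the README does not say how they are loaded), the following check reports any key that is still unset. The helper name `missing_keys` is hypothetical and not part of the repository.

```python
import os

# Key names taken from sections 1.1 and 1.2 of this README.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPAPI_KEY")


def missing_keys(env=None):
    """Return the required key names that are absent or empty in `env`.

    `env` defaults to the process environment; passing a dict makes the
    check easy to try without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```

Running this before the training or inference commands surfaces a missing key early rather than partway through a run.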

1.3 Install requirements

conda env create -f requirements.yaml

1.4 Setup Entity Linking for SPARQL

Entity linking maps text to knowledge graph (KG) facts. For now, this uses pretrained models.

Download mGENRE entity linking files:

mkdir -p utils/retrieval/linking_data/genre
cd utils/retrieval/linking_data/genre
wget https://dl.fbaipublicfiles.com/GENRE/lang_title2wikidataID-normalized_with_redirect.pkl
wget https://dl.fbaipublicfiles.com/GENRE/titles_lang_all105_marisa_trie_with_redirect.pkl
cd ../..
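
Both downloads are large, so it can help to confirm they completed before preprocessing. A minimal sketch, assuming it is run from the directory the commands above end in (so the files sit under `linking_data/genre`); `missing_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Assumed location relative to the current directory after the `cd ../..` above.
GENRE_DIR = Path("linking_data/genre")

# File names taken from the two wget commands above.
EXPECTED_FILES = (
    "lang_title2wikidataID-normalized_with_redirect.pkl",
    "titles_lang_all105_marisa_trie_with_redirect.pkl",
)


def missing_files(directory=GENRE_DIR):
    """Return the expected mGENRE file names not found in `directory`."""
    directory = Path(directory)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing mGENRE files: " + ", ".join(missing))
    else:
        print("All mGENRE files are present.")
```

This only checks that the files exist; it does not validate their contents or size.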

Preprocess entity information:

python linking.py process_titles

2. Instruction-tuning of adaptive query generator (AQG)

python sft_trainer.py \
    --model_name $BASE_MODEL \
    --dataset_name $DATASET_NAME \
    --load_in_8bit \
    --use_peft \
    --batch_size 32 \
    --gradient_accumulation_steps 2 \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs 3 \
    --push_to_hub True \
    --hub_model_id $HUB_MODEL_ID

3. Inference with chain-of-knowledge (CoK)

python run.py \
    --model gpt-3.5-turbo-0613 \
    --dataset $DATASET_NAME \
    --output $OUTPUT_DIR \
    --step True

Citation

@inproceedings{
    li2024cok,
    title={Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources},
    author={Xingxuan Li and Ruochen Zhao and Yew Ken Chia and Bosheng Ding and Shafiq Joty and Soujanya Poria and Lidong Bing},
    booktitle={International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=cPgh4gWZlz}
}

chain-of-knowledge's Issues

s1_domain

Hello, could you share the code for domain selection? It is not in the repository.

sparql-contrastive data not in HF

Great job on the project!

I attempted to execute sparql_contrast.sh, but the dataset veggiebird/sparql-contrastive appears to be missing from Hugging Face (HF).

Thank you!

Could you please add the "retrieve_wikitable_knowledge" and "retrieve_flashcard_knowledge" code

Hello. Could you please add the "retrieve_wikitable_knowledge" and "retrieve_flashcard_knowledge" code and the related knowledge data? Thank you very much!

@chiayewken @xingxuanli

error in “python run.py”

File "/XXX/CoK/utils/hotpotqa_parser.py", line 89, in update_rationales_step_by_step
domains = data_point["s1_domains"]

KeyError: 's1_domains'

Where is data_point["s1_domains"] assigned a value?
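
One way to narrow this down is to check which records lack the key before the parser runs. A minimal sketch, assuming the dataset is loaded as a list of dictionaries; `points_missing_key` is a hypothetical helper, not part of the repository, and the sample records are made-up toy data. It only locates the gap and does not supply the missing domain-selection step.

```python
def points_missing_key(data_points, key="s1_domains"):
    """Return the indices of records that do not contain `key`."""
    return [i for i, point in enumerate(data_points) if key not in point]


if __name__ == "__main__":
    # Toy records for illustration only; real records come from the dataset file.
    sample = [
        {"question": "q1", "s1_domains": ["factual"]},
        {"question": "q2"},
    ]
    print(points_missing_key(sample))
```

If every index is reported, the key was probably never written by an earlier stage, which matches the KeyError above.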

Request for Test Set Details and Availability

Hello,

Thank you for the detailed evaluation presented in the paper. I am particularly interested in the performance analysis of ChatGPT and instruction-tuned LLaMA-2-7B on SQL and SPARQL generation as demonstrated in Table 12.

In the paper, it is mentioned that SPARQL was evaluated on 4,779 samples from LC-quad and KQA-pro, and SQL was evaluated on 15,900 samples from WikiSQL. However, I couldn't find the specific details or availability of the test sets used for these evaluations.

Could you please provide more information about the test sets, or share the test sets themselves, if possible? This would greatly aid in reproducing the results and further understanding the performance metrics presented.

Thank you for your assistance.
