A minimal reproduction of the GitHub repository lukasberglund/reversal_curse and the corresponding paper by Berglund et al. The aim of this repository is to evaluate the reversal curse phenomenon (models trained on "A is B" failing to infer "B is A") across language model architectures and to explore methods for mitigating it.
An especially interesting paper in this area is ROME (Rank-One Model Editing), first described in "Locating and Editing Factual Associations in GPT" by Meng et al. It shows that fact completion in transformer-based language models can be causally traced to specific hidden states, and that the facts themselves can be localized in the MLP layers of the transformer. The MLP blocks act as key-value stores: the representation of the last subject token serves as the key, and the MLP output value encodes properties of that key.
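Below is a minimal, self-contained sketch of this key-value view and of a simplified rank-one edit in that spirit. It is illustrative only: the tensor sizes are arbitrary, `k_star` and `v_star` are stand-ins for the key and value that ROME actually derives via causal tracing and optimization, and the real method preconditions the update with a covariance estimate of the keys.

```python
# Key-value view of a transformer MLP block, plus a simplified rank-one edit
# in the spirit of ROME. Illustrative only; not the full ROME algorithm.
import torch

d_model, d_mlp = 64, 256
W_in = torch.randn(d_mlp, d_model) * 0.02   # produces the "key" activation
W_out = torch.randn(d_model, d_mlp) * 0.02  # maps key -> stored "value"

def mlp(h: torch.Tensor) -> torch.Tensor:
    # key-value view: k = act(W_in h), output = W_out k
    k = torch.relu(W_in @ h)
    return W_out @ k

# Pretend k_star is the key for the last subject token of the fact to edit,
# and v_star is the desired output value encoding the new property.
h_subject = torch.randn(d_model)
k_star = torch.relu(W_in @ h_subject)
v_star = torch.randn(d_model)

# Simplified rank-one update: change W_out by a single outer product so that
# the edited MLP maps k_star exactly to v_star.
residual = v_star - W_out @ k_star
W_out_edited = W_out + torch.outer(residual, k_star) / (k_star @ k_star)

print(torch.allclose(W_out_edited @ k_star, v_star, atol=1e-4))  # True
```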
- Currently experimenting with sequence-to-sequence models that use bidirectional encoders, including BART and T5, to see whether they capture factual associations in both directions during training (see the probing sketch after this list).
- Reverse associations can also be inserted manually after training: recent work proposes editing methods that insert a new fact bidirectionally in a single edit.
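The sketch below shows one way to probe a seq2seq model such as T5 for a fact in both directions using its span-infilling objective. The checkpoint name and the example fact are assumptions for illustration, not results from this repository.

```python
# Probe a T5 checkpoint for a fact in the forward and reverse direction.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-base"  # any T5 checkpoint works for this API sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def fill(prompt: str) -> str:
    # T5 fills the <extra_id_0> sentinel with its predicted span.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=8)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Forward direction: subject -> attribute.
print(fill("Tom Cruise's mother is <extra_id_0>."))
# Reverse direction: attribute -> subject. The reversal curse predicts that
# decoder-only models trained only on the forward form fail here.
print(fill("<extra_id_0>'s son is Tom Cruise."))
```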