
This project forked from iaar-shanghai/datg


Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs

Home Page: https://arxiv.org/abs/2402.11218

License: Apache License 2.0



πŸ•ΈοΈ DATG: Controlled Text Generation for Large Language Models with Dynamic Attribute Graphs

A framework designed for controlled text generation in Large Language Models using dynamic attribute graphs.
Refer to our arXiv paper for detailed insights and methodology.


Introduction

DATG (Dynamic Attribute Graphs-based Text Generation) is an approach to controlled text generation that enables precise control over text attributes during the decoding phase while preserving natural fluency. The method leverages dynamic attribute graphs to evaluate and adjust key terms related to target attributes, steering the attributes of the generated text without compromising text quality.

*(Figure: DATG framework overview)*
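The core idea above can be illustrated with a toy sketch. This is not the authors' implementation: the classifier is a stand-in lexicon, and the graph construction (co-occurrence edges over sampled continuations, weighted-degree centrality scaled by attribute score) is a simplified illustration of how high-attribute key terms might be identified for boosting or suppression during decoding.

```python
# Toy sketch of the dynamic attribute-graph idea (NOT the project's
# actual implementation): candidate key terms are scored by an attribute
# classifier, linked when they co-occur in sampled continuations, and
# ranked so that high-attribute hub words can be boosted or suppressed
# at decoding time.
from collections import defaultdict
from itertools import combinations

def toxicity_score(word):
    # Hypothetical stand-in for an internal attribute classifier.
    toxic_lexicon = {"stupid": 0.9, "idiot": 0.95, "hate": 0.8}
    return toxic_lexicon.get(word, 0.1)

def build_attribute_graph(continuations):
    """Nodes are words; edge weights count co-occurrence within a continuation."""
    edges = defaultdict(float)
    for text in continuations:
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            edges[(a, b)] += 1.0
    return edges

def rank_attribute_words(continuations, top_k=3):
    """Rank words by weighted-degree centrality scaled by attribute score."""
    edges = build_attribute_graph(continuations)
    degree = defaultdict(float)
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
    scored = {w: d * toxicity_score(w) for w, d in degree.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

continuations = [
    "you are stupid and i hate this",
    "this is stupid nonsense",
    "what a lovely day",
]
print(rank_attribute_words(continuations))  # 'stupid' should rank first
```

In the full method, the ranked terms would then inform logit adjustments during generation; here they are simply printed.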

Project structure

.
β”œβ”€β”€ .cache           # Cache some results during evaluation to prevent losing all results
β”œβ”€β”€ .gitattributes   # Git attributes
β”œβ”€β”€ .gitignore       # Ignore files for git
β”œβ”€β”€ README.md        # Project Description
β”œβ”€β”€ analyst.py       # Generate the statistics
β”œβ”€β”€ config.py        # Configuration file for experiment
β”œβ”€β”€ data             # Data for experiment, training classifier and evaluation
β”œβ”€β”€ main.py          # Core file for running the experiment
β”œβ”€β”€ method           # Different CTG methods (including ours)
β”œβ”€β”€ requirements.txt # Required packages
β”œβ”€β”€ results          # Results of the experiment
β”œβ”€β”€ stats            # Statistics of the experiment generated by analyst.py using the results
β”œβ”€β”€ train            # Scripts for training classifiers and other models
└── utils            # Utilities for the project

Usage

Setup

  • Install Python 3.8.18.

  • Clone the project repository.

  • Install required dependencies: pip install -r requirements.txt

  • Complete the configuration in config.py

    Before initiating experiments, configure config.py to suit your experimental setup:

    • Model Paths: Specify the locations of your Large Language Models (LLMs) in MODEL_PATHS. Ensure these paths are accurate to enable proper model loading.

    • Classifier Configuration: Assign paths for internal classifiers (used during generation) and external classifiers (used for evaluation) within TASK_CONFIGURATIONS. Utilize the Jupyter notebooks in the train directory for training these classifiers, and update their paths accordingly.

    • Data and Tasks: Define your specific datasets and tasks in TASK_CONFIGURATIONS, including dataset paths and task-specific settings.

    • Perspective API: If required, insert your Perspective API keys into GOOGLE_API_KEYs after obtaining them. Confirm your system's connectivity to https://commentanalyzer.googleapis.com for accessing API services.

    Ensure all paths, APIs, and configurations are set correctly before running your experiments.
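    The steps above can be sketched as a hypothetical config.py skeleton. Only the names MODEL_PATHS, TASK_CONFIGURATIONS, and GOOGLE_API_KEYs come from this README; every field name and path below is an illustrative guess, not the project's actual schema.

    ```python
    # Hypothetical config.py skeleton -- field names and paths are
    # illustrative guesses; only MODEL_PATHS, TASK_CONFIGURATIONS, and
    # GOOGLE_API_KEYs are named in the README.
    MODEL_PATHS = {
        "phi2_3B_Base": "/models/phi-2",
        "llama2_13B_Base": "/models/Llama-2-13b-hf",
    }

    TASK_CONFIGURATIONS = {
        "toxicMitigation": {
            "data_path": "data/toxic_prompts.jsonl",          # dataset for the task
            "internal_classifier": "train/toxicity_internal",  # used during generation
            "external_classifier": "train/toxicity_external",  # used for evaluation
        },
    }

    GOOGLE_API_KEYs = ["YOUR_PERSPECTIVE_API_KEY"]  # only if Perspective API is needed
    ```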

Running Experiments

  1. To run an experiment, use the following command:

    python main.py --model_name <MODEL_NAME> --task_name <TASK_NAME>

    Replace <MODEL_NAME> with one of the available model names (phi2_3B_Base, llama2_13B_Base, falcon_7B_Base, opt_7B_Base, alpaca_7B_Base), or any additional model you define in config.py.

    Replace <TASK_NAME> with one of the available task names (toxicMitigation, 2Positive, 2Negative), or any additional task you define in config.py.

    Example:

    python main.py --model_name phi2_3B_Base --task_name toxicMitigation
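    To run every model/task combination listed above, the single-run command can be wrapped in a small sweep script. A minimal sketch, assuming the names match your config.py; it only prints the commands, so uncomment the subprocess.run line to actually launch them:

    ```python
    # Sweep every model/task pair from the README's lists.
    # Dry-run by default: prints each command instead of executing it.
    import subprocess

    MODELS = ["phi2_3B_Base", "llama2_13B_Base", "falcon_7B_Base",
              "opt_7B_Base", "alpaca_7B_Base"]
    TASKS = ["toxicMitigation", "2Positive", "2Negative"]

    for model in MODELS:
        for task in TASKS:
            cmd = ["python", "main.py", "--model_name", model, "--task_name", task]
            print("would run:", " ".join(cmd))
            # subprocess.run(cmd, check=True)  # uncomment to actually launch
    ```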

Generating Statistics

  1. After running experiments, generate statistics by executing:

    python analyst.py

    This analyzes the results and produces statistical summaries from the experiment output.
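    The kind of aggregation analyst.py performs can be sketched as follows. The results-file schema (JSON lines with attribute_score and perplexity fields) is a hypothetical assumption for illustration, not the project's actual format:

    ```python
    # Hypothetical sketch of result aggregation -- the JSONL schema
    # (attribute_score, perplexity per record) is an assumed format,
    # not the project's actual one.
    import json
    from statistics import mean

    def summarize(results_path):
        """Average attribute score and perplexity over generated samples."""
        with open(results_path) as f:
            records = [json.loads(line) for line in f]
        return {
            "n": len(records),
            "avg_attribute_score": mean(r["attribute_score"] for r in records),
            "avg_perplexity": mean(r["perplexity"] for r in records),
        }
    ```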

HuggingFace models used in our research

  • tatsu-lab/alpaca-7b-wdiff # Remember to convert the model to HF format and name it alpaca-7b-hf
  • tiiuae/falcon-7b
  • meta-llama/Llama-2-13b-hf
  • facebook/opt-6.7b
  • microsoft/phi-2
  • BAAI/bge-large-en-v1.5
  • openai-community/gpt2-large
  • FacebookAI/roberta-base

Results for Experiment-20240212

  • Effectiveness and Fluency: The DATG approach ranks highly in both toxicity mitigation and sentiment transformation tasks, effectively reducing unwanted attributes while maintaining text fluency. This demonstrates the method's ability to produce high-quality, coherent text across different contexts and requirements.
  • Attribute Control Validation: The success across various datasets confirms our hypothesis that adjusting a few key attribute words can effectively control the text's overall sentiment or toxicity. This strategic modification ensures that the changes in attributes do not compromise the natural flow and coherence of the generated text.
  • Consistency Across Models: The DATG method shows consistent performance in reducing toxicity and transforming sentiment across different LLMs and datasets. This stability across various conditions underscores the robustness of our approach, highlighting its adaptability to different LLMs without losing quality.

  • Speed Advantage: DATG exhibits faster generation speeds than PREADD and FUDGE, underscoring the efficiency of our approach even with complex attribute control mechanisms integrated.
  • Potential for Speed Improvement: Generation speed could be improved further by pre-generating extensive attribute graphs, allowing faster identification of relevant sub-graphs and nodes during generation.

Citation

@article{DATG,
    title={Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs},
    author={Xun Liang and Hanyu Wang and Shichao Song and Mengting Hu and Xunzhi Wang and Zhiyu Li and Feiyu Xiong and Bo Tang},
    journal={arXiv preprint arXiv:2402.11218},
    year={2024},
}
