ppcm's People

Contributors

andreamad8, michalrzak

ppcm's Issues

Execute interact.py

Hi,

To run interact_adapter.py, I used the shell script below.

#!/bin/bash

python3 main.py -D sentiment --label_class 2 --length 30 --num_samples 10 --interact --verbose --sample_starter 10 --load_check_point_adapter runs/very_positive_0/pytorch_model.bin

This command loads only the model with the positive adapter. It does not load the other trained adapters, such as the negative adapter or an adapter trained on the daily dialogue dataset.

How can I use several adapters at the same time?

I am asking because I want to use this in a web demo built on a Flask API.

Thanks
Young-Jun Lee
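
One possible way to serve several styles from one process is to pick an adapter slot per request. The sketch below is only an illustration: it reuses the TASK_MAP labels quoted in a later issue and the convention from MixAdapter.forward that task_id == -1 bypasses the adapter. The helper name resolve_task_id is hypothetical, and it assumes that all trained adapters live in one loaded model, which the repository may not do.

```python
# Label-to-slot mapping as quoted from train_supervised_adapter.py.
TASK_MAP = {"very_negative": 0, "very_positive": 1, "toxic": 2, "question": 3,
            "Business": 4, "SciTech": 5, "Sports": 6, "World": 7}

# MixAdapter.forward returns its input unchanged when task_id == -1.
NO_ADAPTER = -1


def resolve_task_id(style):
    """Hypothetical helper: map a requested style name to an adapter slot."""
    if style is None:
        return NO_ADAPTER  # plain model, no style control
    if style not in TASK_MAP:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(TASK_MAP)}")
    return TASK_MAP[style]
```

A Flask handler could call resolve_task_id on a request parameter and pass the result as task_id to the model's forward pass, so a single process can answer requests for different styles.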

How to control Bag-of-Words?

Hi,

Is it possible to control generation with a Bag-of-Words (as introduced in the PPLM paper)?

If so, how is it controlled, and could you explain it at the code level?

Thank you as always,
Young-Jun Lee

Interact script is missing

Hi, main.py requires the interact.py script, which is missing. In addition, dialogGPT_discr.py imports scripts from the data.emocap and data.empathetic folders, which are not included.

Module "resp_ppl" is missing

Hi,

While training the residual adapter, I hit an import error in train_supervised_adapter.py, in the code shown below:

with jsonlines.open(f) as reader:
    for i, obj in enumerate(reader):
        text = " ".join(tokenize.sent_tokenize(obj["hyp"]["PPLM"][0][-1])[:2])
        score = resp_ppl(text)
        if score > 700:
            continue
        response.append(obj['conversation']['conversation'] + [text])

I couldn't find the resp_ppl function used in the code above, so I removed that part when training the adapter module. Is it okay to proceed like this?

P.S. Sorry for asking again.

Thank you,
Young-Jun Lee
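
For context, the quoted snippet skips any response whose score exceeds 700, which reads like a perplexity filter. The sketch below is not the missing resp_ppl: it only shows the standard perplexity formula applied to per-token log-probabilities, and it assumes those log-probabilities come from some language model that is not shown here. The names perplexity and keep_response are hypothetical.

```python
import math

PPL_THRESHOLD = 700  # cutoff used in the quoted snippet


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def keep_response(token_log_probs, threshold=PPL_THRESHOLD):
    """Mirror the `if score > 700: continue` check: keep low-perplexity text only."""
    return perplexity(token_log_probs) <= threshold
```

Removing the filter entirely means every generated response is kept, including ones the language model finds very unlikely, so the adapter training data may be noisier.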

There are some missing arguments (e.g., "entailment", "BCE", and "bag_of_words")

Hi,

When I ran PPLM, I found that some arguments are missing (e.g., "entailment", "BCE", and "bag_of_words").

So I added the few lines below, which resolved the issue. Would it be okay to add the code like this?

parser.add_argument("--entailment", type=bool, default=False)
parser.add_argument("--BCE", type=bool, default=False)
parser.add_argument("--bag_of_words", type=str, default=None)
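
One caveat about the lines above: with type=bool, argparse calls bool() on the raw string, so any non-empty value, including "False", parses as True. The sketch below demonstrates that behavior and shows action="store_true" as one common alternative. Whether the rest of the repository expects these flags in this form is an assumption.

```python
import argparse

# Pattern from the post: type=bool converts the raw string with bool().
naive = argparse.ArgumentParser()
naive.add_argument("--entailment", type=bool, default=False)
naive_args = naive.parse_args(["--entailment", "False"])
# naive_args.entailment is True here, because bool("False") is True.

# Alternative: presence-only flags that default to False.
flags = argparse.ArgumentParser()
flags.add_argument("--entailment", action="store_true")
flags.add_argument("--BCE", action="store_true")
flags.add_argument("--bag_of_words", type=str, default=None)
flag_args = flags.parse_args(["--BCE"])
```

With the second parser, passing --BCE on the command line switches it on, and omitting a flag leaves it False.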

get_ppl() missing 1 required positional argument: 'starter'

Hi,

The idea of applying adapters to chatbots is great!

However, when I run python train_supervised_adapter.py --dataset SENT --label very_negative --iter 75 --lr 6.25e-4, I get an error at line 103: "get_ppl() missing 1 required positional argument: 'starter'". It looks like the call does not pass the 'starter' argument.

Please help me solve this problem. Thanks!

I have a question about the residual adapter

Hi,

After reading your PPCM paper, I have a question about the plug-and-play adapters. As I understand it, we first generate style-specific datasets using PPLM with trained discriminators. We then optimize only the residual adapter parameters to steer the output distribution of the original LM, while the LM parameters stay fixed. At decoding time, we use these trained adapters, so no fixed number of PPLM iterations is required. This is my understanding of the paper, but I am not sure it is right.

I also tried to run your code. While training the residual adapter, I found a part I didn't understand. In train_supervised_adapter.py, there are 8 task ids (shown below).

TASK_MAP = {"very_negative":0, "very_positive":1, "toxic":2, "question":3, "Business":4, "SciTech":5, "Sports":6, "World":7}

But in modeling_adapter.py, the default setting creates 20 adapters, one per task id.

class MixAdapter(nn.Module):
    def __init__(self, config, bottleneck_size=100, adapter_num=20):
        super(MixAdapter, self).__init__()
        # 20 adapters with task_id 0--20, when task_id==-1 means dont use adapter
        self.mixadapter = nn.ModuleList([Adapter(config, bottleneck_size) for _ in range(adapter_num)])
        
    def forward(self, x, task_id=-1):
        if task_id==-1:
            return x
        else:
            return self.mixadapter[task_id](x)

So I would like to know what the remaining 12 task ids are for. Could you explain how this works? I may not be understanding it properly.

Sincerely,
Young-Jun Lee
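
The arithmetic behind the question can be checked directly. The sketch below uses only the two quoted snippets: TASK_MAP and the default adapter_num=20. It shows that a list of 20 adapters has valid indices 0 to 19 and that indices 8 to 19 have no label in TASK_MAP. Whether those slots are spare capacity for future styles is an assumption; the quoted code does not say.

```python
# Label-to-slot mapping as quoted from train_supervised_adapter.py.
TASK_MAP = {"very_negative": 0, "very_positive": 1, "toxic": 2, "question": 3,
            "Business": 4, "SciTech": 5, "Sports": 6, "World": 7}

ADAPTER_NUM = 20  # default adapter_num in MixAdapter.__init__

valid_ids = list(range(ADAPTER_NUM))   # indices into the list of adapters: 0..19
used_ids = sorted(TASK_MAP.values())   # slots that have a label: 0..7
unused_ids = [i for i in valid_ids if i not in used_ids]  # slots without a label
```

Because MixAdapter.forward indexes the list with task_id, an unused slot never affects the output unless its task_id is passed explicitly.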

About calculating the dist score

Why is dist2 calculated for each sentence? Also, when dist2 is calculated as, for example, dist2 = 16/17 ≈ 0.94, why does the output show 1 - dist2?
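
For reference, the 16/17 figure matches the usual definition of distinct-2: the number of unique bigrams divided by the total number of bigrams in a sentence. The sketch below implements that common definition. It is an assumption that the repository computes dist2 the same way, and the function name distinct_n is hypothetical.

```python
def distinct_n(tokens, n=2):
    """Ratio of unique n-grams to total n-grams in one token sequence."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # sequence shorter than n: no n-grams to count
    return len(set(ngrams)) / len(ngrams)


# 18 tokens give 17 bigrams; the bigram ("a", "b") occurs twice, so 16 are unique.
example_tokens = list("abcdefghijklmnop") + ["a", "b"]
example_score = distinct_n(example_tokens)  # 16 / 17
```

Under this definition a higher value means less repetition, so 1 - dist2 would instead measure the share of repeated bigrams.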

Entailment experiment result

Hi!
I can see in your code that you experimented with entailment as a discriminator for control, but I can't find the results in your paper.

How were the results?
Can PPLM successfully generate entailment-controlled responses?
Or were the results poor, so you didn't mention them in the paper?

Thank you very much!
