andreamad8 / ppcm
Plug-and-Play Conversational Models
License: MIT License
Hi,
When I run interact_adapter.py, I use the shell script below:
#!/bin/bash
python3 main.py -D sentiment --label_class 2 --length 30 --num_samples 10 --interact --verbose --sample_starter 10 --load_check_point_adapter runs/very_positive_0/pytorch_model.bin
This script only loads the model with the positive adapter; it does not load the other trained adapters, such as the one trained on the daily dialogue dataset or the negative adapter.
How can I use several adapters simultaneously?
I would like to use this for a web demo built with the Flask API.
Thanks
Young-Jun Lee
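For reference, one way to serve several styles from a single process is to keep a registry mapping each style label to its adapter checkpoint and task id, and pick an entry per request. This is only a minimal sketch: the registry layout and the select_adapter helper are hypothetical, not part of the repo; the checkpoint path and the task ids follow the examples quoted elsewhere in this thread.

```python
# Hypothetical registry: style label -> (checkpoint path, task id).
# The very_positive path matches the shell script above; ids follow TASK_MAP.
ADAPTER_REGISTRY = {
    "very_negative": ("runs/very_negative_0/pytorch_model.bin", 0),
    "very_positive": ("runs/very_positive_0/pytorch_model.bin", 1),
}

def select_adapter(label):
    """Return (checkpoint_path, task_id) for a requested style, or raise."""
    if label not in ADAPTER_REGISTRY:
        raise KeyError(f"no adapter trained for style '{label}'")
    return ADAPTER_REGISTRY[label]

# A web endpoint would call select_adapter(requested_label), load (or cache)
# the matching adapter weights, and then decode with that task id.
path, task_id = select_adapter("very_positive")
print(path, task_id)
```

In a Flask handler the label would come from the request, so all trained adapters stay addressable without restarting the server.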
Hi,
I wonder whether it is possible to control generation with a Bag-of-Words model (as introduced in the PPLM paper)?
If so, how can it be controlled, and could you explain it at the code level?
Thank you as always,
Young-Jun Lee
Hi, the main.py file requires the interact.py script to work. In addition, dialogGPT_discr.py imports modules from the data.emocap and data.empathetic folders, which are not included.
Hi,
When I trained the residual adapter part, I encountered an import error in train_supervised_adapter.py, shown below:
with jsonlines.open(f) as reader:
    for i, obj in enumerate(reader):
        text = " ".join(tokenize.sent_tokenize(obj["hyp"]["PPLM"][0][-1])[:2])
        score = resp_ppl(text)
        if score > 700:
            continue
        response.append(obj['conversation']['conversation'] + [text])
In the above code, I couldn't find the definition of resp_ppl, so I removed that part when training the adapter module. Is it okay to proceed like this?
P.S. Sorry for asking again.
Thank you,
Young-Jun Lee
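For context, the snippet above uses resp_ppl as a perplexity filter: candidate responses with score > 700 are skipped. The repo's own definition is missing, but a stand-in can be sketched from the standard definition, ppl = exp(-mean per-token log-probability). The function below is hypothetical (the real resp_ppl presumably scores text with the base LM); it only illustrates the arithmetic the filter relies on.

```python
import math

def resp_ppl_stub(token_logprobs):
    """Hypothetical stand-in for resp_ppl: perplexity from per-token
    natural-log probabilities, ppl = exp(-mean(log p))."""
    if not token_logprobs:
        return float("inf")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Sanity check: a uniform distribution over a 50k-token vocabulary
# yields a perplexity of exactly 50000.
lp = math.log(1 / 50000)
print(resp_ppl_stub([lp, lp, lp]))
```

Under this reading, dropping the check simply keeps high-perplexity (likely degenerate) responses in the training data rather than breaking anything else.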
Hi,
When I ran the PPLM, I found some missing arguments (e.g., "entailment," "BCE," and "bag_of_words").
So I added the few lines below, which solved the issue. Would it be okay to add the code like this?
parser.add_argument("--entailment", type=bool, default=False)
parser.add_argument("--BCE", type=bool, default=False)
parser.add_argument("--bag_of_words", type=str, default=None)
Hi,
The idea of applying adapters to chatbots is great!
However, when I run python train_supervised_adapter.py --dataset SENT --label very_negative --iter 75 --lr 6.25e-4
, I get the error "get_ppl() missing 1 required positional argument: 'starter'" at line 103. It seems the 'starter' argument is missing from the call.
Please help me solve this problem. Thanks!
Hi,
After reading your PPCM paper, I have a question about the Plug-and-Play Adapters. As I understand it: first, style-specific datasets are generated using PPLM with trained discriminators; then, only the residual adapter parameters are optimized to steer the output of the original LM distribution, while the LM parameters stay fixed; finally, at decoding time these trained adapters are used directly, so no fixed number of PPLM iterations is required. Please correct me if this understanding is wrong.
I also tried to run your code. While training the residual adapter, I found a part I didn't understand. In the train_supervised_adapter.py file, there are 8 task ids (shown below):
TASK_MAP = {"very_negative":0, "very_positive":1, "toxic":2, "question":3, "Business":4, "SciTech":5, "Sports":6, "World":7}
But, in the modeling_adapter.py
file, there are 20 task ids, the default setting.
class MixAdapter(nn.Module):
    def __init__(self, config, bottleneck_size=100, adapter_num=20):
        super(MixAdapter, self).__init__()
        # 20 adapters with task_id 0--19; task_id == -1 means don't use an adapter
        self.mixadapter = nn.ModuleList([Adapter(config, bottleneck_size) for _ in range(adapter_num)])

    def forward(self, x, task_id=-1):
        if task_id == -1:
            return x
        else:
            return self.mixadapter[task_id](x)
Thus, I would like to know the purpose of the remaining 12 task ids. Could you explain how they are used? I may not have understood this part properly.
Sincerely,
Young-Jun Lee
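The dispatch logic being asked about can be illustrated without PyTorch: task_id == -1 is an identity pass-through, and any other id simply indexes into the adapter list, so unused slots are never touched. In the sketch below the "adapters" are trivial stub functions standing in for the real residual blocks; only TASK_MAP and the 20-slot layout come from the quoted code.

```python
# Stub "adapters": each is just a function on the hidden state here,
# standing in for the residual Adapter modules. 20 slots, ids 0..19.
adapters = [lambda x, i=i: x + i for i in range(20)]

def mix_adapter_forward(x, task_id=-1):
    """Mirror of MixAdapter.forward: -1 bypasses the adapters entirely."""
    if task_id == -1:
        return x
    return adapters[task_id](x)

# The 8 ids actually trained in train_supervised_adapter.py.
TASK_MAP = {"very_negative": 0, "very_positive": 1, "toxic": 2,
            "question": 3, "Business": 4, "SciTech": 5,
            "Sports": 6, "World": 7}

print(mix_adapter_forward(10))                             # 10 (bypass)
print(mix_adapter_forward(10, TASK_MAP["very_positive"]))  # 11 (adapter 1)
```

This suggests the extra 12 slots are spare capacity in the default adapter_num=20 setting: only the ids present in TASK_MAP ever get selected, so the remaining adapters stay untrained and unused.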
Hi!
I can see in your code that you experimented with entailment-based discriminator control,
but I can't find the corresponding results in your paper.
How were the results?
Can PPLM successfully generate entailment-controlled responses?
Or did it perform poorly, which is why it isn't mentioned in the paper?
Thank you very much!
How to use the code?