vipulraheja / coedit
Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023)
Home Page: https://aclanthology.org/2023.findings-emnlp.350/
There are two outputs/targets. How can a transformer be fine-tuned on multiple outputs?
Please let me know.
Hi authors, Thank you very much for the great work.
From Table 1 in the paper, it seems you used around 86k examples for training.
However, the "train_coedit.jsonl" file I downloaded contains only 69k examples. After checking, I think the second-to-last row of the table (STYLE (Formalize)) is missing from the released "train_coedit.jsonl" file.
May I know if you have used this data portion to train the coedit model?
Best regards,
Michael
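A quick way to check which portions of the data are present is to count records per task in the JSONL file. The following is only a minimal sketch: it assumes each line is a JSON object and that the task label is stored under a key named "task", which is an assumption and should be adjusted to whatever the released file actually uses. The sample lines are made up for illustration and are not taken from the dataset.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_task(lines: Iterable[str], task_key: str = "task") -> Counter:
    """Count JSONL records per task label.

    `task_key` is a hypothetical field name; change it to match the file.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[record.get(task_key, "<missing>")] += 1
    return counts


# Made-up sample lines, only to show the expected shape of the input.
sample_lines = [
    '{"task": "gec", "src": "Fix grammar: She go home.", "tgt": "She goes home."}',
    '{"task": "gec", "src": "Fix grammar: He have a car.", "tgt": "He has a car."}',
    '{"task": "formalize", "src": "Make formal: gonna be late", "tgt": "I will be late."}',
]
sample_counts = count_by_task(sample_lines)

# For the real file, something like:
#   with open("train_coedit.jsonl", encoding="utf-8") as f:
#       print(count_by_task(f))
```

Comparing the per-task totals against the rows of Table 1 should show directly whether a whole task category is absent or whether the difference is spread across tasks.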
Hi! Thank you for your great work. The models are excellent at preserving the salient content while making the requested edits. I was wondering if the dataset has already been released or will be soon? I couldn't find it when I searched just now.
Click here for Dataset link
Below is my understanding. Is it correct?
The columns/features from the DiscoFuse dataset that will be the input to the encoder and decoder are:
coherent_first_sentence
coherent_second_sentence
incoherent_first_sentence
incoherent_second_sentence
The encoder will take these four columns as input and encode them into a sequence of hidden states. The decoder will then take these hidden states as input and decode them into a new sentence that fuses the two original sentences together.
The discourse type, connective_string, has_coref_type_pronoun, and has_coref_type_nominal columns will not be used as input to the encoder or decoder. These columns are used to provide additional information about the dataset, but they are not necessary for the task of sentence fusion.
Please correct me if I am wrong; otherwise, if this understanding is right, how shall I implement this task practically?
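One possible way to set this up in practice is sketched below. Note that it differs from the description above in one respect: it assumes the usual sentence-fusion framing, in which only the two incoherent sentences are given to the encoder and the coherent text is the target the decoder learns to produce. The helper name `build_fusion_pair` and the example row are hypothetical; only the column names come from the dataset.

```python
from typing import Dict, Tuple


def _join(first: str, second: str) -> str:
    """Join two sentences with a space, skipping empty ones."""
    return " ".join(part.strip() for part in (first, second) if part and part.strip())


def build_fusion_pair(row: Dict[str, str]) -> Tuple[str, str]:
    """Turn one DiscoFuse row into a (source, target) text pair.

    Assumption: source = incoherent sentences, target = coherent sentences.
    The coherent second sentence may be empty when the fused output
    is a single sentence.
    """
    source = _join(row["incoherent_first_sentence"], row["incoherent_second_sentence"])
    target = _join(row["coherent_first_sentence"], row.get("coherent_second_sentence", ""))
    return source, target


# Made-up row for illustration only.
example_row = {
    "incoherent_first_sentence": "It was raining.",
    "incoherent_second_sentence": "We stayed home.",
    "coherent_first_sentence": "It was raining, so we stayed home.",
    "coherent_second_sentence": "",
}
source_text, target_text = build_fusion_pair(example_row)
```

The resulting pairs can then be tokenized and passed to any encoder-decoder model as input text and label text, the same way as in other text-to-text fine-tuning setups.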
Hello,
I recently read your interesting paper. The results look very promising and I'm excited to try out the COEDIT models.
In Section 4 "Experimental Setup" of the paper, several evaluation datasets and metrics are described. However, I didn't see the code for computing these metrics or the full evaluation pipeline provided in the linked GitHub repo.
Would it be possible for you to please open-source the code you used to evaluate the models and compute the reported metrics? Having access to the exact evaluation scripts and metric implementations would help ensure reproducibility and make it easier for others to benchmark against COEDIT and validate the results.
Thanks!