gem-benchmark / nl-augmenter

NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

License: MIT License

Python 86.87% Jupyter Notebook 13.10% Makefile 0.03%

nl-augmenter's People

Contributors

ns-moosavi gurunathparasaram qcwthu gentaiscool bryanwilie tsor13 zijwang uxyi tanay2001 trendingtechnology kentonmurray duonghuuthanh manandey tgoldsack1 ananyasaib iitmnlp abinayam02 sileod sirrob1997 priyanksonis cclauss rg089 sarvex kalebu luismond bhshri xudongolivershen anuwatavis ashutoshml rbz-99 tanfiona simonmeoni pierrecolombo emilechapuis paulozip evelynmitchell aruna20200 gautierdag wwydmanski multitude0099 philipehausner suchitradubey gxywang ashish3586 kainoj amanrocks11 aatlantise sangminwoo imperialite pawan2411 mukundvarmat gxlarson pvf2005 wschella luisolco boyleconnor akgeni saqibns mgobrain pabloamc agesb fsiar markusbayer109 mg1800 yizhen20133868 gokyori dumpmemory mirzarahim2197 yovakem anisha2102 stjordanis tonysun9 uyaseen gorkaurbizu poneill sotwi-zz marco-digio the-vmlr-lab dsfsi tdopierre asnota abhilashpal seungjaeryanlee romanplusplus xiaohk timothy22000 mrshu bigbench-submission venelink mmeyer39 clauderouxster yiwen-shi maobedkova ijindal mnamysl erlemar dragomirradev kvadityasrivatsa sajantanand denkle

nl-augmenter's Issues

The default performance evaluation shows strange results

Hi all,

If one runs the evaluate.py script against our transformation (#230), the results are very strange. The performance is too good, considering the dramatic changes made by our transformation.

Here is the performance of the model aychang/roberta-base-imdb on the test[:20%] split of the imdb dataset
The accuracy on this subset which has 1000 examples = 96.0
Applying transformation:
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:19<00:00, 51.83it/s]
Finished transformation! 1000 examples generated from 1000 original examples, with 1000 successfully transformed and 0 unchanged (1.0 perturb rate)
Here is the performance of the model on the transformed set
The accuracy on this subset which has 1000 examples = 100.0

On the other hand, if we use non-default models, they produce reasonable results (kudos to @sotwi):

roberta-base-SST-2: 94.0 -> 51.0
bert-base-uncased-QQP: 92.0 -> 67.0
roberta-large-mnli: 91.0 -> 43.0

I speculate that the problem in the default test could be caused by some deficiency in the model aychang/roberta-base-imdb and / or the imdb dataset. But I'm not knowledgeable enough in the inner workings of the model to identify the source of the problem.

How to reproduce the strange results:

Get the writing_system_replacement transformation from #230.

cd to the NL-Augmenter dir.

Run this:

python3 evaluate.py -t WritingSystemReplacement

Expected results:

a massive drop in accuracy, similar to the results by @sotwi on non-default models, as mentioned above.

Observed results:

a perfect accuracy of 100.0.

`insert_abbreviation` incorrectly imports python file and incorrectly assumes resources are relative to transformation directory

These issues appear when trying to use this transformation outside of the root NL-Augmenter directory, for example from a sub-directory of the root directory. The following fixes are needed:

import grammaire.py using the full import path: import transformations.insert_abbreviation.grammaire as grammaire
Remove sys.path.append("./transformations/insert_abbreviation")
Use file = os.path.join(os.path.dirname(os.path.abspath(__file__)), '<file_name>.txt') to build a path relative to the transformation.py script file. This gives easy access to the two .txt resource files.
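
The path fix suggested above could be wrapped in a small helper. This is only a sketch: resource_path is a hypothetical name, not something that exists in NL-Augmenter, and the file name in the usage comment is the placeholder from the issue.

```python
import os


def resource_path(module_file: str, file_name: str) -> str:
    """Return the absolute path of a resource stored next to module_file."""
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, file_name)


# Inside transformation.py one would pass __file__, for example:
#     path = resource_path(__file__, "<file_name>.txt")
```

Because the path is derived from the module's own location, it no longer depends on the directory the process was started from.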

`english_inflectional_variation` throws ValueError when called

Here is the stack trace when the EnglishInflectionalVariation class is initialised:

File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/english_inflectional_variation/__init__.py", line 1, in <module>
    from .transformation import *
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/english_inflectional_variation/transformation.py", line 1, in <module>
    import random, lemminflect
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/lemminflect/__init__.py", line 49, in <module>
    spacy.tokens.Token.set_extension('inflect', method=Inflections().spacyGetInfl)
  File "spacy/tokens/token.pyx", line 47, in spacy.tokens.token.Token.set_extension
ValueError: [E090] Extension 'inflect' already exists on Token. To overwrite the existing extension, set `force=True` on `Token.set_extension`.

Standardize loading of different spacy models

Some of the transformations/filters use different spaCy models (en, es, zh, de). The way they are loaded needs to be standardized. The function initialize_models in initialize.py needs to be rewritten to accept a language parameter, and the following transformations/filters should be updated.
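
One possible shape for a language-aware loader is sketched below. It is not the project's initialize_models: the function name get_spacy_model, the loader argument, and the language-to-model mapping are all assumptions made for illustration, and the model names should be checked against what is actually installed.

```python
from typing import Any, Callable, Dict

# Assumed language-to-model mapping; adjust to the models actually installed.
SPACY_MODEL_NAMES: Dict[str, str] = {
    "en": "en_core_web_sm",
    "es": "es_core_news_sm",
    "zh": "zh_core_web_sm",
    "de": "de_core_news_sm",
}

_loaded_models: Dict[str, Any] = {}


def _default_loader(model_name: str) -> Any:
    import spacy  # imported lazily so this module loads without spaCy

    return spacy.load(model_name)


def get_spacy_model(
    lang: str = "en", loader: Callable[[str], Any] = _default_loader
) -> Any:
    """Load the model registered for lang once and reuse it afterwards."""
    if lang not in SPACY_MODEL_NAMES:
        raise ValueError(f"No spaCy model registered for language {lang!r}")
    if lang not in _loaded_models:
        _loaded_models[lang] = loader(SPACY_MODEL_NAMES[lang])
    return _loaded_models[lang]
```

A transformation would then call get_spacy_model("de") instead of calling spacy.load itself; the loader argument exists only so the cache can be exercised without downloading a model.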

Once the changes are done, test each module individually with pytest using the command below:

pytest -s --t=<module_name>

Transformations:

Filters:

Fix Docstrings and Add Evaluation Scores for Yes-No Question Transformation

Originally requested by @AbinayaM02 in #126 (review)

Error when evaluating TEXT_TO_TEXT_GENERATION

When running python evaluate.py -t ButterFingersPerturbation -task "TEXT_TO_TEXT_GENERATION" -p 1, the following error occurs:

Here is the performance of the model on the transformed set
Length of Evaluation dataset is 226
Traceback (most recent call last):
  File "evaluate.py", line 67, in <module>
    if_filter
  File "./NL-Augmenter/evaluation/evaluation_engine.py", line 41, in evaluate
    percentage_of_examples=percentage_of_examples,
  File "./NL-Augmenter/evaluation/evaluation_engine.py", line 115, in execute_model
    split=f"test[:{percentage_of_examples}%]",
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 44, in evaluate
    dataset, summarization_pipeline, transformation=operation
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 70, in transformation_performance
    pt_dataset, summarization_pipeline
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 81, in performance_on_dataset
    article, gold_summary = example
  File "./NL-Augmenter/dataset.py", line 301, in <genexpr>
    yield (datapoint[field] for field in self.fields)
TypeError: string indices must be integers

Fix Breaking Line in Gender Neutral Rewrite Transformation

This line is breaking the package.

self.nlp = ... should go under the __init__() method.

Language Detection

How can we detect which language is being used for the evaluation on the fly?
We want to apply the correct transformation in "generate" according to the current language.

Thanks in advance

Spacy behaves differently when testing one case vs testing all cases

It seems Spacy's tokenizer behaves differently when I run pytest -s --t=emojify and pytest -s --t=light --f=light.

For example, I added the following snippet in my generate() function:

print([str(t) for t in self.nlp(sentence)])

With input sentence "Apple is looking at buying U.K. startup for $132 billion."

pytest -s --t=emojify gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '132', 'billion', '.']

However, pytest -s --t=light --f=light gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$1', '32', 'billion.']

I use the following code to load spaCy:

import spacy
from initialize import spacy_nlp
self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

It looks very strange. Am I overlooking something?

`GermanGenderSwap` missing `noun_pairs.json` file and incorrectly assumes the resources are on the script path

Hi @raft001,

It seems that in addition to issue #310 there are two other issues that need addressing:

noun_pairs.json is missing. This is needed on line 17.
The script assumes that the resource *.json files will always be on the script path. Please instead do the following to resolve the path:
file = os.path.join(os.path.dirname(os.path.abspath(__file__)), '<file_name>.json')

Then file can be used as the absolute path to your resource file.

`NumberToWord` is not loadable (likely due to hyphens in folder name)

It does not appear possible to load the NumberToWord transformation after installing nlaugmenter.

https://github.com/GEM-benchmark/NL-Augmenter/blob/main/nlaugmenter/transformations/number-to-word/transformation.py

This is likely due to number-to-word breaking python's path loading.
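
A quick way to see why the hyphenated folder is a problem: every component of a dotted import statement has to be a valid Python identifier, and number-to-word is not one. The helper names below are hypothetical and serve only as a sketch.

```python
import keyword


def is_importable_name(folder_name: str) -> bool:
    """True if folder_name can appear in an `import a.b.c` statement."""
    return folder_name.isidentifier() and not keyword.iskeyword(folder_name)


def to_snake_case(folder_name: str) -> str:
    """Replace hyphens with underscores."""
    return folder_name.replace("-", "_")
```

Here is_importable_name("number-to-word") is False, while the underscore form produced by to_snake_case passes the check.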

Standardize module names - Transformation

The module number-to-word should be changed to number_to_word.

Solution:

Rename the folder from number-to-word to number_to_word
Add an entry number_to_word in the test/mapper.py file in the appropriate dictionary (either heavy or light transformation depending on the flag heavy)
Once added, test the module by executing

pytest -s --t=number_to_word

`disability_transformation` Unresolved references to SpaCy

When running the disability_transformation there are several unresolved references to the spacy_nlp variable. In particular on lines:

Line 55: doc = nlp(text)
Line 147: spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

Spacy upgrade to 3.0+

Hi there,
Just wondering - is there any reason spacy is locked with the old version spacy==2.2.4 in the main requirements.txt?

Spacy 3.0 was quite a big upgrade from 2.2.4, and 3.1.0 was just released today so it might make sense to look forward and make that a requirement instead.

I don't think any current implementations would break by this upgrade but I'm happy to make a PR for it and fix things if needed.

Pip installation of NL-Augmenter fails because of `tense` transformation

The tense transformation requires a library called pattern. The requirements reference a forked version of the library, which causes pip to fail. To keep pip from failing, change the requirement to the actual pip package.

`GermanGenderSwap` Violates Naming Convention

Our transformation packages so far have all been in snake_case. @raft001, I believe this is your transformation. Could you change this, please?

gender_culture_diverse_name returns nothing on some inputs

@XudongOliverShen

I just tried your transformation on some summarization inputs and it returns nothing there. I believe there may be an issue if a document does not include any entities.

issue while build

Not sure why the module is failing while building. I am using a direct install from git for one library (see the first line of requirements.txt).
It works locally and also on multiple GCS VMs (where I did the development).
Reference:
https://github.com/GEM-benchmark/NL-Augmenter/pull/113/checks?check_run_id=3040556102

Cannot Run `evaluate.py` Script

I've tried running the evaluate.py script in this Colab notebook. I get the following error:

OSError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZNK3c104Type14isSubtypeOfExtESt10shared_ptrIS0_EPSo

Transformations are missing from the `test/mapper.py`

The mapper file needs to be updated with the transformation/filter names so that default installation and pytest happens for all the light transformations/filters.

unable to build due to pycurl

It looks like the system used to build the image is RHEL-based, hence the pycurl installation is failing. Please refer to the solution in the reference link and install the libraries accordingly.
Failed build: https://github.com/GEM-benchmark/NL-Augmenter/pull/113/checks?check_run_id=3026110664
Reference link: https://stackoverflow.com/questions/66419978/could-not-install-pycurl-7-43-0-6-on-python-3-8-8-rhel-8-3

Should we add a global seed for all transformations?

Almost all transformations, for example butter_fingers_perturbation or replace_numerical_values, use a seed in their constructor that is set to some value. How are we going to handle the global seed? We could easily set one in initialize.py that gets imported in each transformation and used as the default, similar to what is currently done for spacy_nlp. Otherwise, we could also set it during evaluation; as far as I can tell that is not currently done, but I think having a global default is a little cleaner.

Happy to make the required changes if that's something we'd want.
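
The global default described above might look roughly like this. DEFAULT_SEED and SeededOperation are hypothetical names for illustration; the real constructors in the repository may take different arguments.

```python
import random
from typing import Optional

# Hypothetical module-level default, e.g. defined once in initialize.py.
DEFAULT_SEED = 0


class SeededOperation:
    """Toy operation that falls back to the global seed when none is given."""

    def __init__(self, seed: Optional[int] = None):
        self.seed = DEFAULT_SEED if seed is None else seed
        # A private generator avoids disturbing the global random state.
        self.rng = random.Random(self.seed)

    def pick(self, options):
        return self.rng.choice(options)
```

Using None as the sentinel lets a caller still pass an explicit seed, including 0, while everything else follows the shared default.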

Loading of Filter Tests

I think there might be something broken with the filter tests, at least when I extended the test.json of the TextContainsKeywordsFilter to contain another test case:

{
    "type": "keywords",
    "test_cases": [
        {
            "class": "TextContainsKeywordsFilter",
            "args": {
                "keywords": ["in", "at"]
            },
            "inputs": {
                "sentence": "Andrew played cricket in India"
            },
            "outputs": true
        },
        {
            "class": "TextContainsKeywordsFilter",
            "args": {
                "keywords": ["sad"]
            },
            "inputs": {
                "sentence": "Andrew played cricket in India"
            },
            "outputs": false
        }
    ]
}

And then ran: pytest -s --f=keywords

It fails the test, although from my understanding it should still work properly. In particular, after printing self.keywords in the filter method, it seems like there is no new instance created for the new test case and the old keywords are still used which causes the second test case to fail.

Am I misusing something here? I ran into this when writing the tests for my addition of a filter.
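
For comparison, here is a minimal sketch of a runner that builds a fresh filter for every test case, which is the behaviour the report expects. KeywordsFilter is a stand-in written for this example, not the repository's TextContainsKeywordsFilter, and run_test_cases is a hypothetical helper rather than the code in the test suite.

```python
class KeywordsFilter:
    """Stand-in for a filter such as TextContainsKeywordsFilter."""

    def __init__(self, keywords):
        self.keywords = keywords

    def filter(self, sentence: str) -> bool:
        tokens = sentence.lower().split()
        return any(keyword in tokens for keyword in self.keywords)


def run_test_cases(filter_class, test_cases):
    """Build a new filter from each case's args before checking its output."""
    results = []
    for case in test_cases:
        instance = filter_class(**case["args"])  # never reused across cases
        results.append(instance.filter(**case["inputs"]) == case["outputs"])
    return results
```

With the two cases from the test.json above, each case is checked against its own keywords, so both should pass.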

`gender_neutral_rewrite` Unresolved references to spaCy and Unresolved List reference

When running the gender_neutral_rewrite there are several unresolved references to the spacy_nlp variable. In particular on line:

Line 27: self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

Please use from initialize import spacy_nlp to get a handle on the global spacy instance.

There is also an unresolved reference on Line 495: def generate(self, sentence: str) -> List[str]. List[str] is not resolvable. Should this be lower case? e.g. list[str]
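
On the List[str] question: the capitalised List is the generic alias from the typing module, so the annotation resolves once it is imported. The lower-case list[str] form only works on Python 3.9 and later. A minimal sketch, with Example as a hypothetical class:

```python
from typing import List


class Example:
    def generate(self, sentence: str) -> List[str]:
        # Returns the input unchanged; only the annotation matters here.
        return [sentence]
```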

`Formal2Casual` fails to load due to unavailable huggingface model

from nlaugmenter.transformations.formality_change.transformation import Formal2Casual

OSError: prithivida/parrot_adequacy_on_BART is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.

The model (prithivida/parrot_adequacy_on_BART) is indeed not available on huggingface anymore. Perhaps an acceptable alternative is to use prithivida/parrot_adequacy_model instead?

Tests do not Check that Expected and Generated Outputs have Same Number of Sentences

This issue concerns the following line in the main test script:

NL-Augmenter/test/test_main.py

Line 26 in 27ab1d7

for pred_output, output in zip(perturbs, outputs):

The zip() builtin (which is used in the above-mentioned line to pair up expected sentences with generated sentences) clips the longer of its two inputted iterables to the length of the shorter iterable. E.g.:

>>> list(zip([1,2,3], [6,7,8,9,10]))
[(1, 6), (2, 7), (3, 8)]

This means that even if a transformation generates fewer sentences (e.g. 0) than the expected number of sentences, it will still pass and the later expected sentences will not get evaluated. This also makes it impossible to test affirmatively that a transformation does not generate any outputs for a given input.

I would recommend either asserting that the two iterables are of equal length, or replacing zip() with zip_longest().
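
The first recommendation, asserting equal length before pairing, could be sketched as follows. check_outputs is a hypothetical helper, not the code in test_main.py.

```python
def check_outputs(predicted, expected):
    """Compare generated and expected outputs, including their counts."""
    assert len(predicted) == len(expected), (
        f"{len(predicted)} outputs generated, {len(expected)} expected"
    )
    for pred, exp in zip(predicted, expected):
        assert pred == exp, f"mismatch: {pred!r} != {exp!r}"
```

With the length check in place, a transformation that returns nothing fails whenever outputs are expected, and an expected empty list can be tested affirmatively.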

`ocr_perturbation` requirements issues

The ocr_perturbation package requires trdg==1.6.0. However, under macOS 11.6 with Python 3.9 it will not install due to a dependency on pillow==7.0.0, which generates a RequiredDependencyException: zlib error.

Installing pillow==8.3.2 works fine but is too new for trdg==1.6.0.

Installing trdg==1.7.0 has a dependency conflict with opencv-python:

ERROR: Cannot install opencv-python==4.5.3.56, trdg and trdg==1.7.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.3.56 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.2.54 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.2.52 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.1.48 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.46 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.42 depends on numpy>=1.17.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.40 depends on numpy>=1.17.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.3.0.38 depends on numpy>=1.17.3

`sentiment_emoji_augmenter` throws SyntaxWarning messages

When run, it emits the following warning messages:

/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/sentiment_emoji_augmenter/transformation.py:103: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if sentiment is "pos":
/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/sentiment_emoji_augmenter/transformation.py:106: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif sentiment is "neg":
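
The warning goes away when the strings are compared by value. A minimal sketch of the corrected comparison; pick_emoji and the emoji it returns are made up for illustration and are not the transformation's actual code.

```python
def pick_emoji(sentiment: str) -> str:
    # "==" compares values; "is" compares object identity, which is not
    # guaranteed to hold for two equal strings.
    if sentiment == "pos":
        return "😀"
    elif sentiment == "neg":
        return "😞"
    return ""
```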

`correct_common_misspellings` throws FileNotFoundError and incorrectly assumes resources are relative to transformation directory

Remove:

spell_corrections = os.path.join(
        "transformations", "correct_common_misspellings", "spell_corrections.json"
    )

Use file = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'spell_corrections.json') to get a handle on the current path relative to transformation.py script file.

Informal & Untested Suggestions for Possible Transformations

Here are some random ideas informally put which could be used for perturbations & augmentations. @vgtomahawk is making a formal list in this branch.

Meanwhile here is an informal list for the benefit of the participants.

Interchange positions of SRL AM arguments for non-overlapping AM arguments:
- Alex left for Delhi with his wife at 5 pm. --> Alex left for Delhi at 5 pm with his wife.
- "at 5 pm" (AM-TMP) and "with his wife" (AM-COM) can be exchanged: This is safe to do only with non-core arguments and non-overlapping arguments. Check what SRL is here.
The ButterFingersPerturbation could be implemented for keyboard types other than English - like Devanagari (Hindi, Marathi, Nepali), Shahmukhi (Urdu, Persian), South Indian languages (Tamil, Telugu, Kannada, Malayalam) or Chinese, etc.
Style transfer approaches could be interesting to look at - Changing formal to informal and vice versa. Check this model.

What the heck is going on? --> What is going on?
What you upto? --> What are you doing?

Word Order Changes: Active to Passive & vice versa, Topicalisation, Extraposition, Wh-fronting, (& vice versa) & other used in constituency tests.
Scrambling (for German, Turkic languages)
John went to the store to buy bread. --> To buy bread, John went to the store.

The above are only related to SentenceOperation. There are other transformation types too which could be looked at.

Add CUDA argument in evaluate.py to set the "is_cuda" flag in evaluate methods to False. (for non-Nvidia GPUs to use CPU)

Hi All,

I am using macOS for my project, so I run into an issue when trying to evaluate my transformations. As I do not have an Nvidia GPU, I would like to use the CPU when working with PyTorch; otherwise I get an "AssertionError: Torch not compiled with CUDA enabled".

macOS users who do not have an Nvidia GPU have to set device = -1 to avoid using the GPU:
MacOS: "AssertionError: Torch not compiled with CUDA enabled"
allenai/allennlp#877

This seems to stem from the fact that there is currently no way to change the is_cuda flag, which is set to True by default in the evaluate() method inside evaluate_text_classification.py, to False. (There is code to set the device to 0 or -1 based on the is_cuda flag.)

I am able to run my evaluations by changing the is_cuda flag in the code. It would probably be better to make it an argument, so that future users who want to use the CPU instead of a GPU can do so when running python evaluate.py -t [transformation] -task [task_type]

I will be happy to make the required changes if that's something we'd want.
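
A command-line switch for this could be sketched as below. The --no-cuda flag and the --transformation long option are assumptions for the example and do not exist in evaluate.py as far as this issue shows; only the mapping of is_cuda to device 0 or -1 comes from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("-t", "--transformation", required=True)
    # Hypothetical flag: the GPU stays the default, --no-cuda opts out.
    parser.add_argument("--no-cuda", dest="is_cuda", action="store_false")
    return parser


def device_index(is_cuda: bool) -> int:
    """0 selects the first GPU, -1 selects the CPU."""
    return 0 if is_cuda else -1
```

Keeping the GPU as the default preserves the current behaviour for everyone who does not pass the new flag.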

Thanks,
Tim

Swap Transformations

Thank you for your great work! It's super useful!

I have a suggestion for improvement -
Some transformations work on a "swap" principle. For example, in GenderSwap, if we had "sister" in the original sentence then it would be transformed to "brother" and vice versa.
There are scenarios where it is important to know which direction the transformation went, female to male or male to female. In my case, for example, I want to compare the performance of my model on female/male sentences at inference time.

I really liked the way TenseTransformation works. You need to specify in the constructor what tense (past/present/future) you want to transform to.
Maybe that could be applicable for other swap transformations?
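
To make the suggestion concrete, here is a toy sketch of a swap with an explicit target direction, in the spirit of how TenseTransformation is described above. DirectedGenderSwap, its to_gender argument, and the two-entry lexicon are all invented for illustration and are not the repository's GenderSwap.

```python
class DirectedGenderSwap:
    """Toy swap that only rewrites towards the requested target."""

    # Tiny illustrative lexicon; a real transformation needs far more pairs.
    FEMALE_TO_MALE = {"sister": "brother", "mother": "father"}

    def __init__(self, to_gender: str):
        if to_gender not in ("female", "male"):
            raise ValueError("to_gender must be 'female' or 'male'")
        if to_gender == "male":
            self.mapping = dict(self.FEMALE_TO_MALE)
        else:
            self.mapping = {m: f for f, m in self.FEMALE_TO_MALE.items()}

    def generate(self, sentence: str):
        words = [self.mapping.get(word, word) for word in sentence.split()]
        return [" ".join(words)]
```

Since only one direction is applied, the caller always knows which way a changed sentence went.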

Thanks again!

Standardize module name - filters

speech-tag to speech_tag
token-amount to token_amount

`replace_spelling` (`SpellingTransformation`) sets the random seed for each word rather than for each sentence

Setting the random seed for each word leads to jumpy behavior, like the sentence not being transformed at all. The random seed should be set once for the whole sentence (outside of the for loop).
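
The difference can be shown with a small sketch. Both functions are hypothetical and only decide which words would be perturbed; they are not the code of SpellingTransformation.

```python
import random


def flags_with_per_word_seed(words, seed=0, prob=0.5):
    """Problematic pattern: reseeding makes every word see the same draw."""
    flags = []
    for _ in words:
        random.seed(seed)
        flags.append(random.random() < prob)
    return flags


def flags_with_per_sentence_seed(words, seed=0, prob=0.5):
    """Seed once, then let the generator advance for each word."""
    rng = random.Random(seed)
    return [rng.random() < prob for _ in words]
```

In the first version every word gets the identical decision, so a sentence is either perturbed everywhere or not at all; the second version stays reproducible while letting the decisions differ between words.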

Style paraphrasers work best in a two-stage pipeline, can re-use HuggingFace `generate(...)` APIs

Hi everyone, I'm the original author of the STRAP paraphrasers (paper link) which were recently accepted to NL-Augmenter (#227), an effort led by @Filco306. Excited to see these models in NL-Augmenter!

After discussing with @Filco306 and seeing the PR, I saw that 6 different variants of the paraphraser have been provided, a "Basic" style agnostic paraphraser as well as five style-specific paraphrasers (link). While the "Basic" paraphraser is implemented fine, for the style-specific paraphrasers it's recommended to use a two-step pipelined process ---

(1) normalize the text using the "Basic" paraphraser;
(2) pass the output from (1) through the style-specific paraphraser.

This is important since all style-specific paraphrasers were trained on the outputs of "Basic", so any other text is technically out-of-distribution. In an ablation study (-Inf PP. in Table 3 of the paper) we saw a significant drop in style transfer performance without this step. Moreover, the two-step process helps boost output diversity since the "Basic" paraphraser strips input style. This should be fairly simple to implement.
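
The two-step process can be sketched as plain function composition. The paraphrasers are treated as opaque callables from string to string; two_stage and the Paraphraser alias are hypothetical names, and how the actual models are wrapped is left open.

```python
from typing import Callable

Paraphraser = Callable[[str], str]


def two_stage(basic: Paraphraser, styled: Paraphraser) -> Paraphraser:
    """Compose: normalize with basic, then apply the style-specific model."""

    def run(sentence: str) -> str:
        normalized = basic(sentence)  # step (1): strip the input style
        return styled(normalized)  # step (2): add the target style

    return run
```

Each of the five style-specific paraphrasers could then be built by passing the same basic callable as the first argument.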

Another minor point is that the models are fully compatible with the new HuggingFace generate(...) APIs, which provide additional functionality compared to what was originally implemented in my repository (in other words, this import can be avoided). Here's an example of how to do it,

out = gpt2.generate(
    input_ids=gpt2_sentences[:, 0:init_context_size],
    max_length=gpt2_sentences.shape[1],
    return_dict_in_generate=True,
    eos_token_id=eos_token_id,
    output_scores=True,
    do_sample=top_k > 0 or top_p > 0.0,
    top_k=top_k,
    top_p=top_p,
    temperature=temperature,
    num_beams=beam_size,
    token_type_ids=segments[:, 0:init_context_size]
)

Also CCing the NL-Augmenter reviewers for the style paraphraser to keep them in the loop --- @sebastianGehrmann @Nickeilf @juand-r @kaustubhdhole

noun_pairs.json in the outermost folder

Hi @raft001
noun_pairs.json appears in the outermost folder.
https://github.com/GEM-benchmark/NL-Augmenter/blob/main/noun_pairs.json

This file needs to be removed, and we should check that everything works fine without it.

`summarization_transformation` has unresolved reference to spaCy

When running this transformation there are several unresolved references to the spacy_nlp variable. In particular on line:

Line 21: self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm", disable=['ner','textcat'])

Please use from initialize import spacy_nlp to get a handle on the global spacy instance.

Spacy Loading can be done once

Many transformations load spacy multiple times and reparse the same utterance. We will need a mechanism to load spacy once and parse once or at least cache the parse for a string so that when running all transformations together, there is no repetition of parsing.
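
One way to cache the parse per string is to wrap the shared pipeline in functools.lru_cache. This is a sketch under the assumption that the pipeline is a callable taking a sentence; make_cached_parser is a hypothetical helper.

```python
from functools import lru_cache
from typing import Any, Callable


def make_cached_parser(parse: Callable[[str], Any], maxsize: int = 1024):
    """Wrap a parser so that repeated sentences are parsed only once."""

    @lru_cache(maxsize=maxsize)
    def cached(sentence: str) -> Any:
        return parse(sentence)

    return cached


# Possible use with the shared pipeline from initialize.py:
#     cached_nlp = make_cached_parser(spacy_nlp)
```

The cached result is shared between callers, so transformations would need to treat the returned parse as read-only.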

`p1_noun_transformation` wptools dependency issues

The p1_noun_transformation relies on wptools as a dependency. However, wptools depends on pycurl. Unfortunately, pycurl keeps throwing the following message when used:

  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/p1_noun_transformation/__init__.py", line 1, in <module>
    from .transformation import *
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/p1_noun_transformation/transformation.py", line 9, in <module>
    import wptools
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/__init__.py", line 23, in <module>
    from . import core
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/core.py", line 14, in <module>
    from . import request
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/request.py", line 17, in <module>
    import pycurl
ImportError: pycurl: libcurl link-time ssl backends (secure-transport) do not include compile-time ssl backend (openssl)

PR Filter label

There should probably be another label called "filter" to quickly check in the PR's which transformations/filters have already been implemented. Both of my PRs are filters and should therefore not have a transformation label.

Change batch size and number of visible devices for text-style-transfer

Hi @Filco306

Thank you for your great work to make the powerful paraphrasing model easily accessible through HuggingFace! Now it is much easier for me to work with it without the hassle of handling complicated dependencies!

But is there any way for us to use a larger batch size and more GPUs to accelerate the paraphrasing process? Right now I can use only one GPU and a small batch size. I read your implementation here, but there does not seem to be an easy way to do either of them.

Thank you. I am looking forward to your reply.

`re.sub` method error during the evaluation

Hi,
While running the evaluate method (for #246), I get an error in my re.sub method for one of the tests --most likely due to a problem with the escape characters. I can replace it with string.replace to solve the problem. However, this branch is already merged. Do you suggest creating a new branch or to leave the corresponding eval columns empty?
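
The issue does not show the failing call, so the following is only a guess at the class of problem: re.sub interprets metacharacters in the pattern and backslash escapes in the replacement. A literal replacement that stays within re could be sketched as below; replace_literal is a hypothetical helper.

```python
import re


def replace_literal(text: str, old: str, new: str) -> str:
    """Replace old with new, treating both as plain strings."""
    # re.escape neutralizes metacharacters in the pattern; the callable
    # replacement keeps backslashes in new from being read as escapes.
    return re.sub(re.escape(old), lambda _match: new, text)
```

For a purely literal substitution this behaves like str.replace, which is the simpler fix mentioned above.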

Is the first test case skipped?

When adjusting the tests for #146 I noticed that I almost never needed to adjust the first test case in each test.json but all the others. It almost felt as if the first one was being skipped since it is so unlikely that all other test cases needed slight adjustments but the first one always perfectly matched. Can someone quickly check if everything works as intended there? Could very well be chance as well but just to make sure.

Typos discovered by codespell

codespell --ignore-words-list="fro,ist,oder"

./dataset.py:122: relavent ==> relevant
./dataset.py:143: hierachy ==> hierarchy
./notebooks/Write_a_sample_transformation.ipynb:1442: tht ==> the, that
./notebooks/Write_a_sample_transformation.ipynb:1718: exisiting ==> existing
./evaluation/evaluate_text_generation.py:84: upto ==> up to
./transformations/change_two_way_ne/README.md:11: implemetation ==> implementation

Some transformations should have flag `heavy=True` since it's transformer-based

Set the heavy flag to True and add them to the test/mapper.py.

Transformations:

neural_question_paraphraser
mr_value_replacement
protaugment_diverse_paraphrase

Filters:

oscillatory_hallucination

Data augmentation methods and filters that require the entire dataset

Hello!

First of all, thanks for the effort to build such a collaborative framework!

At the moment, the augmentation methods and filters are only provided with a single example per call. Since there are many techniques that need the whole dataset with the class information (to be conditioned on the class, to interpolate instances, etc.), I wanted to ask if there are plans to add this to this framework?

`concept2sentence` by default assumes that a CUDA device is available for pytorch

In the __init__ on line 42:

device='cuda'

This transformation always assumes that a CUDA device is available. However, it should first check whether a CUDA device is available using this helper function from PyTorch: https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html

If not, then CPU should be used instead.
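
The check could be sketched like this. pick_device is a hypothetical helper, and the fallback for a missing torch install is an addition for the example rather than something the issue asks for.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed, fall back to the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The constructor could then default to device=pick_device() instead of the hard-coded 'cuda'.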

Collecting huggingface-hub<0.1.0
  Downloading huggingface_hub-0.0.8-py3-none-any.whl (34 kB)
Collecting sacremoses
  Downloading sacremoses-0.0.45-py3-none-any.whl (895 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Collecting filelock
  Downloading filelock-3.0.12-py3-none-any.whl (7.6 kB)
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/importlib_metadata-4.6.0.dist-info/METADATA'

Perhaps somebody has an idea about this error, which has occurred in many PRs recently.

Thanks!
