sforaidl / decepticonlp
Python Library for Robustness Monitoring and Adversarial Debugging of NLP models
License: MIT License
I'll be implementing Word Mover's Distance using gensim. Since we've already added TensorFlow to our dependencies, I don't think adding gensim should be an issue.
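A minimal sketch of how the distance could be computed with gensim's KeyedVectors (the chosen pre-trained model, the tokenization, and the optional WMD dependency are assumptions, not the final API):

import gensim.downloader as api

word_vectors = api.load("word2vec-google-news-300")  # pre-trained embeddings

original = "the quick brown fox jumps over the lazy dog".split()
perturbed = "the quikc brown fox jumps over the lazy dog".split()

# Lower distance means the perturbed sentence stays closer to the original.
print(word_vectors.wmdistance(original, perturbed))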
All of the transforms mentioned so far expect single words. However, sentences can also be passed in, and then improper transforms are made, for example:
from decepticonlp.transforms.perturb import *
insert_space("Hey There")
# This can give the result as "Hey there"
swap("Hey There")
# This can give the result as "Hey T here" which corresponds more to an add operation
delete("Hey There")
# This can give the result as "Hey There" which corresponds more to an add operation
Word assertions on the first and last characters can also be violated:
from decepticonlp.transforms.perturb import *
swap("Hey There")
# This can give the result as "Hey T here" which swaps the first letter of a word
delete("Hey There")
# This can give the result as "Hey here" which deletes the first letter of a word
Add an assert statement so that only single words are accepted.
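A minimal example of such a check (the message wording and the function body are just a suggestion):

import random

def insert_space(word):
    assert " " not in word, "insert_space expects a single word, not a sentence"
    pos = random.randint(1, len(word) - 1)
    return word[:pos] + " " + word[pos:]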
It does not have to be this particular file, but the goal is to provide a config for pytest/coverage to ignore lines in abstract methods, such as raise NotImplementedError, since such lines should not be reported in code coverage. Other candidates include if __name__ == "__main__" blocks.
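A possible .coveragerc using coverage.py's exclude_lines option (where exactly we keep this config is up to us):

[report]
exclude_lines =
    pragma: no cover
    raise NotImplementedError
    if __name__ == .__main__.: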
Suppose we have a string with two words "how is". We can replace it with its contraction "how's". This can be done for multiple cases like:
he will: he'll
he had: he'd
and so on and so forth.
This paper does it for Question Answering/Machine Translation: https://www.aclweb.org/anthology/P18-1079.pdf.
I'm sure there are many other papers which have examples of replacing words with their respective contractions.
As @rajaswa mentioned, this isn't exactly a character-level perturbation and comes under the domain of paraphrasing. So, we can have multiple other examples of paraphrasing and implement those.
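A minimal sketch of such a transform (the dictionary and function name are hypothetical, not decepticonlp's API):

CONTRACTIONS = {"how is": "how's", "he will": "he'll", "he had": "he'd", "it is": "it's"}

def contract(sentence):
    # Greedily replace each known expanded form with its contraction.
    for expanded, contracted in CONTRACTIONS.items():
        sentence = sentence.replace(expanded, contracted)
    return sentence

print(contract("he will be late"))  # "he'll be late"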
This module needs to be planned well, as we will need it for all our future implementations and it will be required at every step.
Currently, there are a lot of libraries available for this, common ones including Flair and torchtext. We need a pipeline that will convert any textual input to vectors/matrices for further computation.
I will put up the assumed pseudocode as well; first, let's discuss the features/plan in words.
NOTE : THIS IS FOR SINGLE PASS ATTACKS ONLY
We missed out on approaches where black-box attacks get classification results from the models.
SINGLE PASS
TRAIN
Additional: we can add three versions of decepticons, strong, stealthy, and balanced; top_k rankings will be done on the basis of fall in accuracy (fall), metric distance, or a weighted mean (see the sketch below).
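A rough illustration of how the three modes could rank candidate attacks (all names and the scoring are assumptions for discussion):

def rank_candidates(candidates, mode="balanced", alpha=0.5, top_k=5):
    """candidates: list of dicts with 'fall' (drop in accuracy) and 'distance' keys.
    strong   -> largest fall in accuracy first
    stealthy -> smallest metric distance first
    balanced -> weighted mean of the two"""
    if mode == "strong":
        key = lambda c: -c["fall"]
    elif mode == "stealthy":
        key = lambda c: c["distance"]
    else:
        key = lambda c: -(alpha * c["fall"] - (1 - alpha) * c["distance"])
    return sorted(candidates, key=key)[:top_k]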
In decepticonlp/transforms/perturb.py, in the function typo, we have defined the Python dictionary a certain way, with the keys as all the characters and their corresponding values as the characters close to the respective key on the QWERTY keyboard. But we haven't taken digits (0-9) into account, and we might have missed a few alphabetic characters as well.
For example:
1. Our implementation: "e": ["w", "s", "d", "r"]
   Their implementation: "e": ["2", "@", "3", "#", "4", "$", "w", "r", "s", "d", "f"]
2. Our implementation: "h": ["g", "y", "u", "j", "n", "b"]
   Their implementation: "h": ["t", "y", "u", "g", "j", "b", "n", "m"]
For details, have a look at this (under the section QWERTY):
https://towardsdatascience.com/data-augmentation-library-for-text-9661736b13ff
They have used "One Keyboard Distance Error" while deciding which characters are in proximity on the QWERTY keyboard.
I am a bit doubtful about special characters though, since users tend to remove them during text pre-processing. So, I leave that to your discretion.
Even if we ignore the extra alphabetic characters, I think numeric characters must be added.
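For illustration, a few entries of how the extended dictionary could look (the "e" and "h" values are taken from the examples above; the digit entry is an assumed example following the same one-keyboard-distance rule):

extended_qwerty = {
    "e": ["2", "@", "3", "#", "4", "$", "w", "r", "s", "d", "f"],
    "h": ["t", "y", "u", "g", "j", "b", "n", "m"],
    "5": ["4", "6", "r", "t"],  # digits get neighbours too
}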
from decepticonlp.transforms.perturb import *
print(visual_similar_chars('shashank','unicode','visual'))
The above code sometimes returns None, perhaps due to the use of *arg instead of **kwargs.
Bug source
If possible, implement this as a character-level perturbation: find words whose character embeddings are in proximity (in embedding hyperspace) to the word being edited.
https://towardsdatascience.com/the-definitive-guide-to-bidaf-part-2-word-embedding-character-embedding-and-contextual-c151fc4f05bb
https://arxiv.org/pdf/1812.05271.pdf
https://towardsdatascience.com/besides-word-embedding-why-you-need-to-know-character-embedding-6096a34a3b10
I'll update the issue with more resources and try to implement this as well
I am clueless about how to implement integration tests. Otherwise, I have run the code on datasets for all four classification losses in my local repo.
The swap perturbation is a special case of the shuffle: merge the two functions with a probability parameter.
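A possible merged version (a sketch only; the parameter name and the minimum-length assertion are assumptions):

import random

def shuffle(word, swap_prob=0.0):
    """Shuffle the middle characters of a word; with probability swap_prob,
    only swap one adjacent pair of middle characters (the old swap behaviour)."""
    assert len(word) >= 4, "shuffle requires a word of at least 4 characters"
    mid = list(word[1:-1])
    if random.random() < swap_prob:
        i = random.randrange(len(mid) - 1)
        mid[i], mid[i + 1] = mid[i + 1], mid[i]
    else:
        random.shuffle(mid)
    return word[0] + "".join(mid) + word[-1]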
Implement standard metrics from this paper
See #57 for example
Design and implement an OOP structure for our Bugger, which will take a dataset, get queried by the user, and generate bugged sentences.
For the important-word extractor, use a temporary random selector for the time being.
Refer to Slack for reference.
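A very rough skeleton of what that could look like (all names are placeholders for discussion, not an agreed design):

import random

class Bugger:
    """Takes a dataset, answers user queries, and generates bugged sentences."""

    def __init__(self, dataset, perturbations):
        self.dataset = dataset              # iterable of sentences
        self.perturbations = perturbations  # list of word-level perturbation callables

    def _pick_words(self, words, k=1):
        # Temporary random selector until a proper important-word extractor lands.
        return random.sample(range(len(words)), min(k, len(words)))

    def bug(self, sentence):
        words = sentence.split()
        for idx in self._pick_words(words):
            words[idx] = random.choice(self.perturbations)(words[idx])
        return " ".join(words)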
For swap and delete, minimum lengths of 4 and 3 respectively are necessary; add assertions with appropriate messages.
Any project should have unit tests, and they should run in CI so that another contributor cannot break things that were previously working.
We can find similar words from the pre-trained GloVe embeddings, or word2vec for that matter. We can directly load the file and work on it, or use gensim. @rajaswa and @someshsingh22, what do you think?
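A minimal sketch with gensim's downloader and KeyedVectors (the chosen GloVe model is just an example):

import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-100")  # pre-trained GloVe vectors
print(glove.most_similar("movie", topn=5))   # nearest neighbours in embedding space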
I'll take in a sentence and return a list of keywords
I believe implementing these self-explanatory functions could make for good perturbations, though I am not quite sure which category they'd go under. @rajaswa and @someshsingh22, thoughts?
I'll implement this in whichever section it is required.
Remove the try/except blocks from the typo function.
Reduce the code by using sample.
According to the instructions in CONTRIBUTING.rst from the cookiecutter template, there is an error when setting up the local environment:
python setup.py
Traceback (most recent call last):
File "setup.py", line 7, in <module>
with open("README.rst") as readme_file:
FileNotFoundError: [Errno 2] No such file or directory: 'README.rst'
The issue remains the same for
python setup.py develop
Traceback (most recent call last):
File "setup.py", line 7, in <module>
with open("README.rst") as readme_file:
FileNotFoundError: [Errno 2] No such file or directory: 'README.rst'
Add the 'README.rst' file.
Similar to perturbations.py, we need an OOP implementation for metrics.py, perhaps with an abstract method .calculate()?
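A possible skeleton (class and method names are assumptions mirroring the perturbations design; the Levenshtein subclass is just one example metric):

import abc

class CharMetric(abc.ABC):
    """Base class for distance metrics between original and perturbed text."""

    @abc.abstractmethod
    def calculate(self, original: str, perturbed: str) -> float:
        raise NotImplementedError

class LevenshteinDistance(CharMetric):
    def calculate(self, original: str, perturbed: str) -> float:
        # Classic dynamic-programming edit distance.
        prev = list(range(len(perturbed) + 1))
        for i, a in enumerate(original, 1):
            curr = [i]
            for j, b in enumerate(perturbed, 1):
                curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (a != b)))
            prev = curr
        return float(prev[-1])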
I believe one of us could implement the following:
There's literature where people have used this metric; I can't seem to find it as of now. I will update this later.
Note: we could also use all (1, 2, ..., k)-grams to capture more context; this comes at the cost of more computation time.
# Torchvision
transform = transforms.Compose([rotate, crop, resize, grayscale])
transformed_image = transform(image)
# AdvNLP
transform = transforms.compose([add, swap, delete, visually_similar])
transformed_text = transform.apply(text)
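A minimal sketch of what the NLP-side compose could look like (names follow the snippet above; this is not an existing decepticonlp API):

class Compose:
    """Apply a list of perturbation callables to a piece of text, in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def apply(self, text):
        for t in self.transforms:
            text = t(text)
        return text

# Usage, assuming word-level perturbation functions already exist:
# pipeline = Compose([swap, delete, visual_similar_chars])
# bugged = pipeline.apply("adversarial")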
Need to update the contribution guidelines with respect to CI builds.
To implement a common black box, we need text loading, extraction of the words to be attacked, perturbations, distance metrics, and models.
Text loading needs to be very uniform and universal; it should encapsulate all common practices, including embeddings, tokenizers, and batch loaders, and should support commonly used libraries like NLTK, spaCy, BERT, etc.
We need to think about how we should design this before our first attack.
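One possible way to frame the pieces as interfaces (all names are hypothetical, for discussion only):

import abc

class Extractor(abc.ABC):
    @abc.abstractmethod
    def extract(self, sentence):
        """Return indices of the words to attack."""

class BlackBoxAttack:
    def __init__(self, extractor, perturber, metric, model):
        self.extractor, self.perturber = extractor, perturber
        self.metric, self.model = metric, model

    def attack(self, sentence):
        words = sentence.split()
        for idx in self.extractor.extract(sentence):
            words[idx] = self.perturber(words[idx])
        bugged = " ".join(words)
        # The model is only queried for predictions, never for gradients (black box).
        return bugged, self.model.predict(bugged), self.metric.calculate(sentence, bugged)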
The np.random.choice function might be a better way to pick the random method.
Also, from a documentation and usual-practice perspective, it is not standard to define docstrings in the middle of a function.
def visual_similar_chars(word, *arg):
    method_pick = np.random.randint(0, len(arg))
    if arg[method_pick] == "unicode":
        """
        get diacritic characters
        """
        char_array = np.array(list(word))
        diacritic = np.char.add(char_array, u"\u0301")
        return diacritic
    if arg[method_pick] == "visual":
        """
        get visually similar chars. like @ for a. 0 for O.
        """
        return None
Should probably be something like the other functions.
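A possible rewrite along those lines (the "visual" branch is still to be implemented, so it is left as an explicit placeholder rather than silently returning None):

import numpy as np

def visual_similar_chars(word, *methods):
    """Return a visually perturbed copy of word, using one randomly chosen method."""
    method = np.random.choice(methods)
    if method == "unicode":
        # Append a combining acute accent to every character (diacritic look-alikes).
        char_array = np.array(list(word))
        return "".join(np.char.add(char_array, "\u0301"))
    if method == "visual":
        # TODO: substitute visually similar characters, e.g. "@" for "a", "0" for "O".
        raise NotImplementedError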
shifts a character by one keyboard space: essentially simulates a typo
The test cases need to be restructured according to the source directory:
Check GenRL for reference:
https://github.com/SforAiDl/genrl/tree/master/genrl
https://github.com/SforAiDl/genrl/tree/master/tests
Consider a word like "sales"; if we add an extra "a", it becomes "saales". This feature can be added as an enhancement to the current code. Please refer to the following paper for more details: https://arxiv.org/pdf/1905.11268.pdf (check the figure on page 4 for a quick reference).
Substitute-C (Sub-C): Replace characters with visually similar characters (e.g., replacing “o” with “0”, “l” with “1”, “a” with “@”) or adjacent characters in the keyboard (e.g., replacing “m” with “n”).
We can use a pre-compiled dictionary from https://github.com/codebox/homoglyph/blob/master/raw_data/chars.txt
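A rough sketch of Sub-C using such a dictionary (the parsing assumes each non-comment line of chars.txt groups characters that look alike, which should be verified against the file; names are placeholders):

import random

def load_homoglyphs(path="chars.txt"):
    groups = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            for ch in line:
                # Every character in a group can stand in for every other one.
                groups.setdefault(ch, []).extend(c for c in line if c != ch)
    return groups

def sub_c(word, homoglyphs):
    idx = random.randrange(len(word))
    candidates = homoglyphs.get(word[idx], [])
    if not candidates:
        return word
    return word[:idx] + random.choice(candidates) + word[idx + 1:]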