Uncensor*

This module unmasks censored strings such as "f**k".

But Why?

In our web-tracking tasks, we often come across statements like "That C.E.O is a p***k!". If you have to run sentiment analysis on such a post, or even just save it appropriately in a full-text data store (we love Elasticsearch), you must first decode what p***k stands for. This is what we call "Uncensoring"!

I'm sure there are many other use cases for this, especially now that a divisive U.S. election has churned a lot of curse words out onto the interwebs!

Enough Politics. Let's Dive In!

It is easy to use uncensor. Install it from npm:

npm install --save uncensor

Then unmask a single word:

const uncensor = require('uncensor');

// Unmask a single censored word.
const masked = "f**k";
const unmasked = uncensor.unmask(masked);

console.log(unmasked);

This prints out:

{
    "censored": "f**k",
    "results": {
        "word": {
            "profanity": "fuck",
            "popularity": 9
        },
        "other_words": [
            {
                "profanity": "fook",
                "popularity": 0
            },
            {
                "profanity": "feck",
                "popularity": 0
            }
        ],
        "meta": {
            "count": 3,
            "steps": "Length Check > Start Letter Match > Last Letter Match > Levenshtein Ordering [3 words]"
        }
    }
}

Note that the results include a meta object that lists the steps taken to arrive at the words presented.

  • Length Check: candidates are filtered by the length of the mask.

  • Start Letter Match & Last Letter Match: masked words usually reveal their first and last letters, so the candidates are further filtered by those letters.

  • Levenshtein Ordering: when several candidates remain, they are sorted by Levenshtein distance and profanity popularity.
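
The steps above can be sketched roughly as follows. This is not the module's actual source: the `WORDS` list, the `levenshtein` helper and the `rankCandidates` function are hypothetical names made up for illustration, and comparing the mask itself against each candidate is an assumption about how the distance is used.

```javascript
// Hypothetical word list; popularity values mirror the sample output above,
// the extra entries exist only to show each filter rejecting something.
const WORDS = [
  { word: 'fuck', popularity: 9 },
  { word: 'fook', popularity: 0 },
  { word: 'feck', popularity: 0 },
  { word: 'prick', popularity: 0 }, // rejected by the length check
  { word: 'damn', popularity: 0 },  // rejected by the start letter match
  { word: 'fuss', popularity: 0 },  // rejected by the last letter match
];

// Classic dynamic-programming Levenshtein distance between two strings.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed pipeline: length check, start letter, last letter, then ordering
// by distance (ascending) with popularity (descending) as the tie-breaker.
function rankCandidates(masked, words) {
  if (!masked) return [];
  const mask = masked.toLowerCase();
  const first = mask[0];
  const last = mask[mask.length - 1];
  return words
    .filter((w) => w.word.length === mask.length)
    .filter((w) => w.word[0] === first)
    .filter((w) => w.word[w.word.length - 1] === last)
    .map((w) => ({ ...w, distance: levenshtein(mask, w.word) }))
    .sort((a, b) => a.distance - b.distance || b.popularity - a.popularity);
}

console.log(rankCandidates('f**k', WORDS).map((c) => c.word));
```

With a fully starred middle such as "f**k", every surviving candidate is equally distant from the mask, so popularity decides the winner. The distance presumably matters more when the mask leaves some inner letters visible.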

Dealing With Phrases

You can also unmask entire phrases.

const uncensor = require('uncensor');

// Unmask every censored word inside a longer piece of text.
const masked_phrase = "That guy is such a p***y. Hate the m*****fckuer!";
const unmasked_phrase = uncensor.unmask_phrase(masked_phrase);

console.log(unmasked_phrase);

// Prints: That guy is such a pussy. Hate the motherfucker!
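
One way phrase unmasking could work is to find each token that contains asterisks and swap in the best guess for it. The sketch below only illustrates that idea and is not how uncensor is known to implement `unmask_phrase`: the `MASKED_TOKEN` pattern, the `unmaskPhraseSketch` function and the `lookup` table are all hypothetical, and the table stands in for a real per-word unmasking step.

```javascript
// Matches a run of letters and asterisks that contains at least one asterisk.
const MASKED_TOKEN = /[A-Za-z]*\*+[A-Za-z*]*/g;

// Replaces every masked token using the supplied per-word function.
// Tokens the function cannot resolve (undefined) are left untouched.
function unmaskPhraseSketch(phrase, unmaskWord) {
  return phrase.replace(MASKED_TOKEN, (token) => {
    const guess = unmaskWord(token);
    return guess === undefined ? token : guess;
  });
}

// Hypothetical stand-in for a real word-level unmasker.
const lookup = new Map([
  ['p***y', 'pussy'],
  ['m*****fckuer', 'motherfucker'],
]);

const phrase = 'That guy is such a p***y. Hate the m*****fckuer!';
console.log(unmaskPhraseSketch(phrase, (token) => lookup.get(token)));
```

Leaving unresolved tokens as they are keeps the rest of the text intact, which is probably the safer default when the result feeds sentiment analysis or a search index.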

Run the Tests...

You can run the tests in the tests folder to see more examples.
