mlh-fellowship / jamspam
GitHub App to jam the spam PRs on your repo and keep maintainers stress-free (even in Hacktober)
License: MIT License
Create a list of 50 Spam PRs (links) in the shared Google Sheet
Add spam.csv to jam-spam-ml/data
jam-spam-app with Probot
MLH-Fellowship Organization
Implement NLP to extract keywords from SPAM and HAM corpus.
A frequency vector of these keywords would be a useful feature for our model. To make sure the keywords are specific to the SPAM and HAM characteristics of a PR, we decided to do the following:
N = complexity of the model (starting with 30, might change iteratively to achieve better results)
A = Top N keywords list from SPAM dataset
B = Top N keywords list from HAM dataset
SPAM_KEYWORDS = (A - B)
HAM_KEYWORDS = (B - A)
We suggest using multi-rake for rapid keyword extraction from the corpus.
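The keyword selection above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it swaps in plain word counting from the standard library in place of multi-rake, and the function names (top_keywords, split_keywords, frequency_vector) and the sample texts are hypothetical.

```python
import re
from collections import Counter

# Assumption: plain word counts stand in for multi-rake keyword extraction.
WORD = re.compile(r"[a-z']+")


def top_keywords(docs, n):
    """Return the n most frequent words across a list of PR texts."""
    counts = Counter()
    for doc in docs:
        counts.update(WORD.findall(doc.lower()))
    return [word for word, _ in counts.most_common(n)]


def split_keywords(spam_docs, ham_docs, n=30):
    """Compute SPAM_KEYWORDS = A - B and HAM_KEYWORDS = B - A."""
    a = set(top_keywords(spam_docs, n))  # A: top N keywords from SPAM
    b = set(top_keywords(ham_docs, n))   # B: top N keywords from HAM
    return sorted(a - b), sorted(b - a)


def frequency_vector(text, keywords):
    """Count how often each keyword occurs in one PR text."""
    counts = Counter(WORD.findall(text.lower()))
    return [counts[k] for k in keywords]


# Hypothetical sample data, only to show the shape of the inputs.
spam = ["fix typo fix typo readme", "update readme typo"]
ham = ["fix bug in parser", "add parser tests fix bug"]
spam_keywords, ham_keywords = split_keywords(spam, ham, n=3)
```

Words that rank highly in both corpora (here "fix") drop out of both lists, which is the point of taking the set differences: only words characteristic of one class remain as features.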
Create a list of 50 Ham PRs (links) in the shared Google Sheet
Add ham.csv to jam-spam-ml/data
Create a script to scrape or possibly use APIs to get attributes of spam and ham PRs in order to build a complete dataset
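One possible shape for such a script, using the GitHub REST API endpoint for a single pull request, is sketched below. The choice of attributes in FIELDS and the helper names (api_url, extract_attributes, fetch_attributes) are assumptions made for illustration; the source does not say which attributes the dataset should contain.

```python
import json
import re
import urllib.request

# Matches PR links such as https://github.com/<owner>/<repo>/pull/<number>
PR_LINK = re.compile(r"https://github\.com/([^/]+)/([^/]+)/pull/(\d+)")

# Assumption: an illustrative subset of fields from the PR API response.
FIELDS = ("title", "body", "commits", "additions", "deletions", "changed_files")


def api_url(pr_link):
    """Translate a PR web link into its REST API URL."""
    match = PR_LINK.match(pr_link.strip())
    if match is None:
        raise ValueError(f"not a pull request link: {pr_link!r}")
    owner, repo, number = match.groups()
    return f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"


def extract_attributes(pr):
    """Keep only the selected fields; missing ones become None."""
    return {field: pr.get(field) for field in FIELDS}


def fetch_attributes(pr_link):
    """Fetch one PR over the network and return its selected attributes."""
    request = urllib.request.Request(
        api_url(pr_link),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_attributes(json.load(response))
```

Calling fetch_attributes on each link from the spam and ham lists would yield one row per PR. Unauthenticated API requests are rate-limited, so a real run over the full lists would likely need an access token.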
(Those ticked have been worked on in #12 (WIP) or earlier)
Keep the pull request open regardless of its contents if all of the conditions below are met (skip spam detection)
documentation