
d4pika / fraud-ad-detection-using-natural-language-processing


This project intends to solve the house hunt problem by using NLP to filter spam out of housing listings and sending users updates on new listings that match their selection criteria. It uses SMTP to send emails, nltk for NLP, and tkinter for the UI.

Python 100.00%
ui tkinter smtp nlp-machine-learning stacked-ensembles fraud housing-listings craigslist regular-expression

fraud-ad-detection-using-natural-language-processing's Introduction

Fraud-ad-detection-using-Natural-language-processing

Millions of ads are posted on Craigslist worldwide every single day, largely anonymously, and millions of housing listings are posted each month. This makes it difficult for people looking for a new home to check every listing. As per trends, 6% of housing ads are spam. However, Craigslist cannot run all around the world policing and prosecuting the people who post them.

This project intends to solve the house hunt problem by filtering spam out of housing listings and sending users updates on new listings that match their selection criteria. Classified ad sites routinely process hundreds of thousands to millions of posted ads, and only a small percentage of those may be fraudulent. Online scammers often go to great lengths to make their listings look legitimate. Examples include copying existing advertisements from other services, tunneling through local proxies, and even paying for extra services using stolen account information.

This project aims to provide value to both its client (Craigslist) and its users by solving some of the key issues. High volumes of rental scams damage the reputation of Craigslist and increase its user drop-off rates. Users have to spend hours finding legitimate ads, and it takes a lot of time and resources to select a genuine listing from thousands of existing listings.

This project applies data analysis and text analysis concepts and techniques to the detection of online classified fraud in housing ad listings, and builds an automated notification system that sends new listings matching the user's search keywords. Traditional data mining is used to extract relevant attributes from a database of online classified advertisements, and machine learning algorithms are applied to discover patterns and relationships of fraudulent activity. With our proposed approach, we will demonstrate the effectiveness of applying data mining techniques to the detection of fraud in online classified advertisements for housing in major cities.
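
The flow described above (flag suspicious ads, keep listings that match the user's keywords, then email them) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it swaps the machine learning classifier for a few hand-written regular-expression red flags, and the pattern list, the function names, and the `title`/`url`/`text` dictionary keys are all assumptions made for the example.

```python
import re
from email.message import EmailMessage

# Hypothetical red-flag patterns, for illustration only. The project itself
# describes learning fraud patterns with machine learning, not fixed rules.
RED_FLAGS = [
    re.compile(r"\bwire transfer\b", re.I),
    re.compile(r"\bwestern union\b", re.I),
    re.compile(r"\bout of (the )?country\b", re.I),
]


def red_flag_count(text):
    """Count how many of the hypothetical red-flag patterns occur in the text."""
    return sum(1 for pattern in RED_FLAGS if pattern.search(text))


def matches_keywords(text, keywords):
    """Return True if every search keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def select_listings(listings, keywords, max_flags=0):
    """Keep listings that match the keywords and have at most max_flags red flags."""
    return [
        ad for ad in listings
        if matches_keywords(ad["text"], keywords)
        and red_flag_count(ad["text"]) <= max_flags
    ]


def build_notification(listings, sender, recipient):
    """Build (but do not send) an email that lists the selected ads."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(listings)} new housing listing(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"- {ad['title']}: {ad['url']}" for ad in listings))
    return msg
```

The resulting message could then be handed to `smtplib.SMTP(...).send_message(msg)` over an authenticated connection; the server address and credentials depend on the mail provider and are left out here.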
