hostile-url-detector's Introduction

Hostile-URL-Detector (HUD): a multi-staged URL analysis tool to provide useful insights about an input URL

  • With rapid advances in digital technology and the spread of the internet, data security and privacy are at growing risk. Attacks such as phishing, defacement and malware are a constant threat to the data of large corporations and individual internet users alike. These attacks are mostly motivated by financial gain, as in phishing attacks where a link/URL (Uniform Resource Locator) may redirect the user's data to a different, hidden URL and cause harm.

  • In this project we call anything harmful, whether financially, socially or mentally, hostile. We propose a methodology that classifies an input URL into one of the attack types and gives the user more information about the URL they are about to visit. We assembled a large set of labelled links (189,780 URLs, combined from various sources and covering the types malware, malicious, adult, phishing, defacement and spam). On this set we perform lexical feature extraction, which derives features from the text of the link itself, and we also extract several novel features that we propose.

  • In total, we extracted 62 features and achieved the highest predictive accuracy, 99.54%, with a random forest classifier. The other algorithms evaluated were Gaussian Naive Bayes, Decision Tree Classifier, Multi-Layer Perceptron Classifier and K-Nearest Neighbors.
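
To make the idea of lexical feature extraction more concrete, here is a minimal sketch that uses only the Python standard library. The function name `lexical_features` and the eight features it returns are illustrative assumptions; they are not the project's actual 62 features, which are not part of the uploaded code.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper: NOT the project's real feature extractor.
def lexical_features(url: str) -> dict:
    """Derive a few simple features from the text of a URL alone."""
    # urlparse only recognises the host when a scheme is present,
    # so prepend one if the input does not start with "scheme://".
    if not re.match(r"^[a-zA-Z][a-zA-Z0-9+.\-]*://", url):
        url = "http://" + url
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special": len(re.findall(r"[@\-_%=&?]", url)),
        # A raw IPv4 address in place of a domain name.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }

if __name__ == "__main__":
    print(lexical_features("google.com"))
    print(lexical_features("http://192.168.0.1/login?id=5"))
```

Each URL becomes a fixed-length dictionary of numbers and booleans, which is the kind of row a supervised classifier could be trained on.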

About the code

  • Python was used to code the entire project because:
    • It gives easy access to supervised machine learning algorithms through sklearn, along with libraries like numpy and pandas.
    • It provides tkinter for building a simple GUI.
    • It is easy to code in, with data structures and computing logic that are simple to understand.

NOTE:

  • Only the GUI part has been uploaded, so the application is non-functional: only the GUI will execute.
  • Install sklearn, tkinter, themedtk and any other requirement the console asks for.
  • Install the Audiowide font.

Screenshots

Architecture


Feature Extractor


Home tab

Screenshot (188)

Giving "google.com" as input and pressing Enter key

Screenshot (183)

Giving adult URL as input (Stage-1)

Screenshot (184)

Giving adult URL as input (Stage-2)

Screenshot (185)

Performance display tab (Modelwise predictive accuracy for each URL type)

Screenshot (186)
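
As a rough illustration of the figures this tab displays, the sketch below computes an overall accuracy and a per-type breakdown from a list of true labels and a list of predicted labels. The function name `accuracy_by_type` and the sample labels are assumptions made for this example. Here "accuracy for a URL type" is taken to mean the fraction of URLs of that type that were classified correctly, which may differ from how the project computes it.

```python
from collections import defaultdict

# Hypothetical helper: the project's own evaluation code is not uploaded.
def accuracy_by_type(true_labels, predicted_labels):
    """Return (overall accuracy, {url_type: fraction classified correctly})."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    per_type = {label: correct[label] / total[label] for label in total}
    overall = sum(correct.values()) / len(true_labels) if true_labels else 0.0
    return overall, per_type

if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = ["phishing", "phishing", "malware", "spam"]
    guess = ["phishing", "malware", "malware", "spam"]
    print(accuracy_by_type(truth, guess))
```

Running the same function once per trained model would give one row of per-type figures for each model.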

Dependency checker tab

Screenshot (187)

History tab

Screenshot (176)

hostile-url-detector's People

Contributors

thesaurabhkushwaha, roshanshanbhogue
