
riyasai22 / bitcoin-transaction-ransomware-detection





Bitcoin-Transaction-Ransomware-Detection

Comparative analysis of hybrid Outlier detection with Machine Learning and Deep Learning Algorithms for Bitcoin Heist Ransomware Detection

The dataset used is from the UCI Machine Learning Repository and contains parsed Bitcoin transaction graphs from January 2009 to December 2018. The dataset has been curated from three widely adopted studies: Montreal, Princeton and Padua. Daily transactions on the network have been extracted using a 24-hour time interval, and the Bitcoin graph has been formed from them. Network edges that transfer less than the minimum threshold of 0.3 bitcoins have been filtered out. The dataset consists of approximately 3,000,000 observations and 10 attributes used to express a specific ransomware transaction pattern.

The comprehensive data analysis and machine learning pipeline includes exploratory data analysis (EDA), outlier detection, handling of unbalanced data, standardization, and classification. Here is a summary of the key steps and techniques used:

Exploratory Data Analysis (EDA):

EDA is conducted using correlation heatmaps to visualize the relationships between variables. Distribution plots (e.g., histograms) and box plots are used to understand the distribution of data and identify potential outliers.
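
A correlation heatmap is a rendering of the pairwise correlation matrix, so the matrix itself can be inspected first. The following is a minimal sketch with pandas on synthetic data; the column names and values are illustrative assumptions and are not taken from the project's notebooks.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the real features come from the UCI dataset.
rng = np.random.default_rng(0)
n = 500
length = rng.integers(0, 144, size=n).astype(float)
df = pd.DataFrame({
    "length": length,
    "count": length * 2.0 + rng.normal(0, 5, size=n),    # deliberately correlated with length
    "income": rng.lognormal(mean=18, sigma=1, size=n),   # heavy-tailed and independent
})

# Pairwise Pearson correlations; this matrix is what a heatmap visualizes.
corr = df.corr(method="pearson")
print(corr.round(2))
```

If seaborn is installed, `seaborn.heatmap(corr, annot=True)` would draw this matrix as a heatmap.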

Outlier Detection:

IQR (Interquartile Range):

The IQR method is applied to detect outliers based on the range between the first quartile (Q1) and the third quartile (Q3), where IQR = Q3 - Q1. Values that fall below Q1 - k × IQR or above Q3 + k × IQR are identified as outliers; k = 1.5 is the conventional choice.
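
The IQR rule can be sketched in a few lines of NumPy. The helper name `iqr_outlier_mask` and the multiplier `k=1.5` are assumptions made for illustration; the multiplier used in the project is not stated here.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

data = np.array([10, 12, 11, 13, 12, 11, 10, 95])
mask = iqr_outlier_mask(data)
print(data[mask])  # prints [95]
```

For a multi-column dataset the same mask would typically be computed per feature and the results combined.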

Isolation Forest:

The Isolation Forest algorithm is used to detect outliers by isolating them from the rest of the data through random partitioning. Points that can be isolated in fewer splits receive higher anomaly scores, and those with higher scores are considered outliers.
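
A minimal sketch of this step with scikit-learn's `IsolationForest`, assuming synthetic two-dimensional data and an illustrative `contamination` value. Note that scikit-learn's `score_samples` uses the opposite sign convention: lower values mean more anomalous.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: a dense blob plus three far-away points.
rng = np.random.default_rng(42)
inliers = rng.normal(loc=0.0, scale=1.0, size=(300, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]])
X = np.vstack([inliers, outliers])

# contamination is the assumed share of outliers; 0.01 is an illustrative choice.
forest = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
labels = forest.fit_predict(X)      # -1 = outlier, 1 = inlier
scores = forest.score_samples(X)    # lower = more anomalous in scikit-learn

X_clean = X[labels == 1]            # keep only the rows flagged as inliers
print(X.shape[0] - X_clean.shape[0], "rows flagged as outliers")
```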

DBSCAN (Density-Based Spatial Clustering of Applications with Noise):

DBSCAN is employed for clustering data points based on their density. Points that do not belong to any cluster are labeled as noise and treated as outliers.
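
As a sketch of how noise points can be read off a DBSCAN fit, the example below uses scikit-learn's `DBSCAN` on synthetic data. The `eps` and `min_samples` values are assumptions chosen for this toy data, not the project's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense synthetic clusters plus two isolated points.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=[0, 0], scale=0.3, size=(100, 2))
cluster_b = rng.normal(loc=[5, 5], scale=0.3, size=(100, 2))
stragglers = np.array([[2.5, 2.5], [10.0, 0.0]])
X = np.vstack([cluster_a, cluster_b, stragglers])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)

# DBSCAN assigns the label -1 to points that belong to no cluster.
noise_mask = db.labels_ == -1
print("noise points:", int(noise_mask.sum()))
```

Because DBSCAN is distance-based, features are usually scaled before fitting so that no single attribute dominates `eps`.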

Unbalanced Data Handling:

SMOTE (Synthetic Minority Over-sampling Technique) is used to balance the dataset by oversampling the minority class with synthetic samples interpolated between existing minority samples and their nearest neighbors, which can improve classification performance.
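
The core interpolation idea behind SMOTE can be sketched in plain NumPy. This is a simplified illustration, not the imbalanced-learn implementation; the function name `smote_sketch` and its parameters are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)              # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)        # randomly chosen minority samples
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                 # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Synthetic imbalanced data: 200 majority rows, 20 minority rows.
rng = np.random.default_rng(1)
X_major = rng.normal(0, 1, size=(200, 2))
X_minor = rng.normal(3, 0.5, size=(20, 2))

X_synth = smote_sketch(X_minor, n_new=len(X_major) - len(X_minor))
X_minor_balanced = np.vstack([X_minor, X_synth])
print(X_major.shape, X_minor_balanced.shape)
```

In practice, the imbalanced-learn package offers this as `SMOTE().fit_resample(X, y)`.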

Standardization Techniques:

Different data scaling techniques are attempted, including StandardScaler, MinMaxScaler, and RobustScaler, to standardize the features and prepare them for machine learning.
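
The three scalers respond differently to extreme values, which matters for heavy-tailed transaction data. A small sketch on a toy column with one large value (the data is made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler

# One feature with a single extreme value.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scalers = {
    "StandardScaler": StandardScaler(),   # subtract mean, divide by standard deviation
    "MinMaxScaler": MinMaxScaler(),       # rescale to the range [0, 1]
    "RobustScaler": RobustScaler(),       # subtract median, divide by IQR
}

scaled = {name: s.fit_transform(X).ravel() for name, s in scalers.items()}
for name, values in scaled.items():
    print(f"{name:>14}: {np.round(values, 2)}")
```

With RobustScaler the four ordinary values stay evenly spaced around zero, while MinMaxScaler compresses them toward zero because the extreme value sets the upper bound.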

Classification Algorithms:

Various classification algorithms are used to build models:

  • Logistic Regression
  • K-Nearest Neighbors (KNN)
  • Support Vector Machine (SVM)
  • Decision Tree
  • Random Forest
  • AdaBoost
  • XGBoost
  • Neural Networks
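
A comparison across several of these classifiers might be set up as below. This is a sketch on synthetic data with default-like hyperparameters, not the project's configuration. XGBoost and the neural network are left out because they need packages beyond scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

# Synthetic, moderately imbalanced binary problem (80% / 20%).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # Fit the scaler on the training split only, then the classifier.
    clf = make_pipeline(StandardScaler(), model)
    clf.fit(X_train, y_train)
    results[name] = clf.score(X_test, y_test)   # test-set accuracy

for name, acc in results.items():
    print(f"{name:>20}: {acc:.3f}")
```

Wrapping each model in a pipeline keeps the scaling step from seeing the test split.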

Evaluation Metrics:

Model performance is evaluated using standard classification metrics, including:

  • Accuracy: Measures the overall correctness of predictions.
  • Precision: Measures the percentage of positive predictions that are true positives.
  • Recall: Measures the percentage of actual positives that are correctly identified.
  • F1 Score: The harmonic mean of precision and recall, useful for imbalanced datasets.
  • ROC (Receiver Operating Characteristic) and AUC (Area Under the Curve): Measure the model's ability to distinguish between classes, particularly useful for binary classification.
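
The metrics above can be computed directly from their definitions. The sketch below uses pure Python on a made-up set of labels and scores; the helper names are hypothetical and the 0.5 decision threshold is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.2, 0.3, 0.4, 0.7, 0.9, 0.8, 0.6, 0.45]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]   # assumed threshold of 0.5

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
auc = roc_auc(y_true, y_score)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f} auc={auc:.3f}")
```

scikit-learn provides the same quantities through `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` in `sklearn.metrics`.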
