maitree7 / fraud_detection_sql

Fraud detection on credit card transactions

TSQL 5.37% Jupyter Notebook 94.63%
postgresql erdiagram data-engineering plotly-python

fraud_detection_sql's Introduction

Fraud_Detection_SQL

Credit Card Fraudster by Richard Patterson | Creative Commons Licensed

Background

Fraud is everywhere these days—whether you are a small taco shop or a large international business. While there are emerging technologies that employ machine learning and artificial intelligence to detect fraud, many instances of fraud detection still require strong data analytics to find abnormal charges.

This project applies SQL skills to analyze historical credit card transactions and consumption patterns in order to identify possible fraudulent transactions.

It covers three main tasks:

  1. Data Modeling: Define a database model to store the credit card transactions data and create a new PostgreSQL database using your model.

  2. Data Engineering: Create a database schema on PostgreSQL and populate your database from the CSV files provided.

  3. Data Analysis: Analyze the data to identify possible fraudulent transactions.


Files

Query Files

CSV Files

Data Modeling

Create an entity relationship diagram (ERD) by inspecting the provided CSV files.

Note: For the credit_card table, the card column should be a VARCHAR(20) datatype rather than an INT.
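One reason to prefer a text type here is range: PostgreSQL's INT is a signed 32-bit integer, so a typical 16-digit card number does not fit in it. Storing the value as text also preserves any leading zeros. A minimal sketch of the range check, using a made-up card number (an assumption, not a value from the CSV files):

```python
# PostgreSQL's INT (integer) is a signed 32-bit type.
INT_MAX = 2**31 - 1  # 2147483647


def fits_in_int(card: str) -> bool:
    """Return True if the card number, read as an integer, fits in a 32-bit INT."""
    return int(card) <= INT_MAX


# Hypothetical 16-digit card number, for illustration only.
sample_card = "4000123412341234"

print(fits_in_int(sample_card))  # False: too large for INT
print(len(sample_card) <= 20)    # True: fits in VARCHAR(20)
```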

The ERD was developed with Quick Database Diagrams (QuickDBD).

QuickDBD-export

Data Engineering

Using your database model as a blueprint, create a database schema for each of your tables and relationships. Specify data types, primary keys, foreign keys, and any other constraints you defined.

After creating the database schema, import the data from the corresponding CSV files.
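The schema-then-import flow can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and the card_holder table, every column name other than card, and the inline CSV rows are assumptions rather than the repository's actual files.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical two-table schema: each credit card belongs to one card holder.
conn.executescript("""
CREATE TABLE card_holder (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE credit_card (
    card          VARCHAR(20) PRIMARY KEY,
    cardholder_id INTEGER NOT NULL REFERENCES card_holder(id)
);
""")

# Inline stand-ins for the CSV files (made-up rows).
holders_csv = io.StringIO("id,name\n1,Alice Example\n2,Bob Example\n")
cards_csv = io.StringIO(
    "card,cardholder_id\n4000123412341234,1\n4000567856785678,2\n"
)


def load_csv(conn, table, columns, handle):
    """Insert every CSV row into `table` and return the number of rows read."""
    rows = [tuple(row[c] for c in columns) for row in csv.DictReader(handle)]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    conn.executemany(sql, rows)
    return len(rows)


# Parent table first, so the foreign key on credit_card can be satisfied.
n_holders = load_csv(conn, "card_holder", ["id", "name"], holders_csv)
n_cards = load_csv(conn, "credit_card", ["card", "cardholder_id"], cards_csv)
conn.commit()
print(n_holders, n_cards)  # 2 2
```

Loading tables in dependency order matters: a credit_card row that points at a missing card holder is rejected by the foreign key. In PostgreSQL itself the import would more typically go through COPY or a GUI import tool.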

Data Analysis

Now that your data is prepared within the database, it's finally time to identify fraudulent transactions using SQL and Pandas DataFrames.

Top 100 highest transactions during early hours i.e. 7:00 to 9:00 AM

Early_hour
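A query of this shape can be sketched as below. It runs against an in-memory SQLite database as a stand-in for PostgreSQL; the column names, the sample rows, and the reading of the window as 7:00 up to but not including 9:00 are all assumptions. In PostgreSQL the hour would usually be taken with EXTRACT(HOUR FROM date) rather than strftime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "transaction" is an SQL keyword, so the table name is quoted.
conn.execute(
    'CREATE TABLE "transaction" (id INTEGER PRIMARY KEY, date TEXT, amount REAL)'
)
conn.executemany(
    'INSERT INTO "transaction" VALUES (?, ?, ?)',
    [
        (1, "2018-01-01 07:15:00", 1894.0),
        (2, "2018-01-01 08:59:59", 3.5),
        (3, "2018-01-01 09:00:00", 1000.0),  # hour 9: outside the window
        (4, "2018-01-02 21:00:00", 2000.0),  # evening: outside the window
        (5, "2018-01-03 08:10:00", 748.0),
    ],
)

top_early = conn.execute(
    """
    SELECT id, date, amount
    FROM "transaction"
    WHERE CAST(strftime('%H', date) AS INTEGER) BETWEEN 7 AND 8
    ORDER BY amount DESC
    LIMIT 100
    """
).fetchall()

print([row[0] for row in top_early])  # [1, 5, 2]
```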

  • Some fraudsters hack a credit card by making several small payments (generally less than $2.00), which are typically ignored by cardholders. Count the transactions that are less than $2.00 per cardholder. Is there any evidence to suggest that a credit card has been hacked? Explain your rationale.

  • What are the top five merchants prone to being hacked using small transactions?

  • Once you have a query that can be reused, create a view for each of the previous queries.
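The small-transaction count and its view can be sketched as follows, again with SQLite standing in for PostgreSQL. The view name, the cardholder_id and amount columns, and the sample rows are hypothetical; the join on the card column follows the relationship described for the transaction and credit_card tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE credit_card (card VARCHAR(20) PRIMARY KEY, cardholder_id INTEGER);
CREATE TABLE "transaction" (id INTEGER PRIMARY KEY, amount REAL, card VARCHAR(20));

INSERT INTO credit_card VALUES
    ('4000123412341234', 1),
    ('4000567856785678', 2);
INSERT INTO "transaction" VALUES
    (1,   1.50, '4000123412341234'),
    (2,   0.75, '4000123412341234'),
    (3, 250.00, '4000123412341234'),
    (4,   1.99, '4000567856785678'),
    (5,   2.00, '4000567856785678');  -- exactly $2.00 is not "less than $2.00"

-- Reusable view: number of sub-$2.00 transactions per cardholder.
CREATE VIEW small_transactions_per_cardholder AS
SELECT cc.cardholder_id, COUNT(*) AS small_count
FROM "transaction" AS t
JOIN credit_card AS cc ON cc.card = t.card
WHERE t.amount < 2.00
GROUP BY cc.cardholder_id;
""")

counts = conn.execute(
    """
    SELECT cardholder_id, small_count
    FROM small_transactions_per_cardholder
    ORDER BY small_count DESC, cardholder_id
    """
).fetchall()
print(counts)  # [(1, 2), (2, 1)]
```

The merchant question follows the same pattern, grouping by merchant instead of cardholder and limiting the result to five rows.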

Created a report on possibly fraudulent transactions for some of the firm's top customers, with visualizations built using Pandas, Plotly Express, hvPlot, and SQLAlchemy.

  • Examined the transaction history of two of the firm's most important customers, identified by cardholder IDs 2 and 18, for fraudulent transactions.

id_holder_2 id_holder_18

  • Observation: The consumption patterns of the two cardholders are very different. Cardholder 2 makes many small transactions, while cardholder 18 has transactions ranging up to $1839. Cardholder 2 appears more susceptible to fraudulent transactions.

  • The CEO of the firm's biggest customer suspects that someone has used her corporate credit card without authorization in the first quarter of 2018 to pay for several expensive restaurant bills. You are asked to find any anomalous transactions during that period.

    • Using Plotly Express, created a series of six box plots, one for each month, in order to identify how many outliers there are per month for cardholder ID 25.

    id_holder_25

    • Observations: There appear to be fraudulent transactions in the Restaurant and Food Truck categories, where Food Truck transactions range from $1.46 to $1046.

Challenge

Another approach to identify fraudulent transactions is to look for outliers in the data. Standard deviation or quartiles are often used to detect outliers.

Identifying Outliers based on Standard Deviation

anomalous_transaction
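A minimal sketch of the standard-deviation approach, using only Python's statistics module. The cutoff of k standard deviations and the sample amounts are assumptions chosen for illustration (only the $1046 figure echoes a value mentioned in this report); the notebook may use a different threshold.

```python
import statistics


def std_outliers(amounts, k=2.0):
    """Return the amounts lying more than k sample standard deviations from the mean."""
    mean = statistics.mean(amounts)
    std = statistics.stdev(amounts)
    return [a for a in amounts if abs(a - mean) > k * std]


# Made-up transaction amounts with one very large charge.
amounts = [12.5, 9.8, 11.2, 10.4, 13.1, 10.9, 11.7, 9.5, 12.0, 1046.0]

print(std_outliers(amounts))  # [1046.0]
```

With small samples a single extreme value inflates the standard deviation itself, so a strict cutoff such as k=3 can miss it; that is one reason the quartile-based method is often preferred.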

Identifying Outliers based on Interquartile Range

anomalous_transaction
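The interquartile-range approach can be sketched in the same way. The 1.5 × IQR fences are the conventional choice and are assumed here, as are the sample amounts; statistics.quantiles requires Python 3.8 or later.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up transaction amounts with one very large charge.
sample = [12.5, 9.8, 11.2, 10.4, 13.1, 10.9, 11.7, 9.5, 12.0, 1046.0]

print(iqr_outliers(sample))  # [1046.0]
```

Because quartiles are barely affected by a single extreme value, the fences stay tight around the typical amounts even when the sample is small.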

fraud_detection_sql's Issues

transaction and credit card joining issue.

I tried to join the transaction and credit_card tables using a join on transaction.card = credit_card.card, but it didn't return anything. I think there are no matching values in the two columns, yet QuickDBD-export.png shows a connection between the transaction table's card column and the credit_card table's card column.
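One way to investigate an empty join like this is to compare the key values from both tables directly. The sketch below uses made-up values and assumes, without confirmation from the repository, that the mismatch comes from how the card columns were typed or imported (for example an integer column on one side and text with stray whitespace on the other).

```python
def normalize(card) -> str:
    """Render a card value as a stripped string so both sides compare equal."""
    return str(card).strip()


# Hypothetical key values as they might come back from each table.
transaction_cards = [4000123412341234, 4000567856785678]        # read as integers
credit_card_cards = ["4000123412341234 ", "4000567856785678"]   # read as text

raw_matches = set(transaction_cards) & set(credit_card_cards)
clean_matches = {normalize(c) for c in transaction_cards} & {
    normalize(c) for c in credit_card_cards
}

print(len(raw_matches), len(clean_matches))  # 0 2
```

If the normalized sets overlap while the raw ones do not, the fix belongs in the schema or the import step (consistent column types, trimmed values) rather than in the join itself.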
