
alberto-paparella / instagramfakeaccountdetection


Repository for the project of the Artificial Intelligence course @unibo, a.y. 2022-23.

Python 100.00%
artificial-intelligence detection fake-account fake-accounts instagram


Instagram Fake Account Detection

This project is a deliverable for the "Artificial Intelligence" exam of the Master's Degree in Computer Science at Alma Mater Studiorum, University of Bologna.

Introduction

Instagram is one of the most widely used social networks today, especially among young people, companies, and content creators. However, the large number of fake accounts degrades the experience of all its users by spreading spam, fraud, and fake engagement.

This project therefore analyzes in depth the features that help discriminate fake accounts from real ones using Machine Learning techniques, with a particular interest in explainability. It starts from two existing proposals from the literature and improves the attribute set of each, both by proposing new features derived from the existing ones and by removing features considered unnecessary. Various Machine Learning techniques are then compared to give a better understanding of the solution, with positive results.

Installing and running the project

This section contains instructions to configure and run the project. Depending on your installation, you may need to invoke python as python3. Python 3 is required, as is the venv module.

  1. Either download or clone the project from GitLab.
  2. Open up a terminal inside the project directory.
  3. Create a virtual environment by running the command python -m venv venv. This will create a virtual environment in which the libraries will get installed.
  4. Install libraries with ./venv/bin/pip install -r ./requirements.txt.
  5. Run the script main.py with the command ./venv/bin/python ./main.py.
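The steps above use POSIX paths (./venv/bin/...). As a minimal sketch, assuming a standard venv layout, the same three commands can be built from Python so that they also resolve on Windows, where the executables live under venv/Scripts. The helper names venv_bin, setup_commands, and run_setup are hypothetical and are not part of the project.

```python
import os
import subprocess
import sys
from pathlib import Path


def venv_bin(venv_dir, name, windows=None):
    """Return the path of an executable inside a virtual environment.

    POSIX layouts keep executables in "bin"; Windows keeps them in
    "Scripts" with an ".exe" suffix. `windows=None` auto-detects the OS.
    """
    if windows is None:
        windows = os.name == "nt"
    venv_dir = Path(venv_dir)
    if windows:
        return venv_dir / "Scripts" / f"{name}.exe"
    return venv_dir / "bin" / name


def setup_commands(project_dir=".", windows=None):
    """Build the commands for steps 3 to 5 without running them."""
    project = Path(project_dir)
    venv_dir = project / "venv"
    return [
        # Step 3: create the virtual environment.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # Step 4: install the libraries listed in requirements.txt.
        [str(venv_bin(venv_dir, "pip", windows)), "install", "-r",
         str(project / "requirements.txt")],
        # Step 5: run main.py with the interpreter from the environment.
        [str(venv_bin(venv_dir, "python", windows)), str(project / "main.py")],
    ]


def run_setup(project_dir="."):
    """Run the three steps in order, stopping at the first failure."""
    for command in setup_commands(project_dir):
        subprocess.run(command, check=True)
```

Calling run_setup() from the project directory would execute the three steps in sequence; setup_commands() alone only returns the command lists, which is handy for checking the paths before running anything.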

Once the script is running, follow the on-screen instructions. Do not run generate_dl_dataset.py: it is not needed for the demo, and it will invalidate all the work done on the MLP experiments.

Codebase structure

  • main.py is the executable script that runs the experiments.
  • generate_dl_dataset.py is the executable script that instantiates or resets the datasets used by the deep learning models.
  • utils/utils.py contains helper functions, such as the ones that run the experiments or compute the metric scores.
  • dataset/
    • normalizer.py contains a script that merges two different datasets into a single one and exports it in .json format.
    • utils.py contains helper functions for working with the datasets, such as shuffling, splitting, and building combined datasets.
    • deep/ contains all the fixed datasets for multilayer perceptron experiments.
    • sources/ contains the datasets that are being used for the experiments.
      • automatedAccountData.json contains the fake accounts of the InstaFake dataset.
      • nonautomatedAccountData.json contains the real accounts of the InstaFake dataset.
      • user_fake_authentic_2class.csv contains the IJECE dataset.
  • deep/ contains all the functions and models related to the multilayer perceptron.
    • common.py contains several utility functions for the multilayer perceptron.
    • experiment.py contains the main experiment runner for the multilayer perceptron.
    • combined/ contains training scripts, model definitions and models for the "combined" datasets.
    • compatible/ contains training scripts, model definitions and models for the "compatible" datasets.
    • IJECE/ contains training scripts, model definitions and models for the "IJECE" datasets.
    • InstaFake/ contains training scripts, model definitions and models for the "InstaFake" datasets.
  • visualization/
    • plotter.py contains many useful functions to plot data and results.
    • plots/ contains the plots for the data analysis on the original datasets.
    • plots_results/ contains the plots representing the result of the experiments.
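
To make the roles of the dataset and metric helpers more concrete, here is a minimal sketch of the kind of operations the list describes: labelling and combining fake and real accounts, shuffling and splitting, and computing metric scores. It is an illustration only and does not reproduce the project's code. The function names, the is_fake label key, the 80/20 split ratio, and the choice of metrics are all assumptions.

```python
import random


def label_and_combine(fake_accounts, real_accounts, label_key="is_fake"):
    """Merge two lists of account dicts, tagging each record with a 0/1 label.

    `label_key` is a hypothetical field name, not taken from the project.
    """
    combined = [{**account, label_key: 1} for account in fake_accounts]
    combined += [{**account, label_key: 0} for account in real_accounts]
    return combined


def shuffle_and_split(records, train_ratio=0.8, seed=42):
    """Shuffle a copy of `records` reproducibly and split it into train/test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def metric_scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fake)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard every division so empty or one-sided inputs return 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Fixing the seed makes the split reproducible across runs, which matters when several models are compared on the same data.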
