osll / code-plagiarism

A program for finding plagiarism in source code written in Python 3, C, and C++, based on comparing AST metadata.
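To give a feel for what comparing AST metadata can mean, here is a minimal sketch that uses only Python's standard `ast` module. It is not the algorithm codeplag implements; the functions `node_type_counts` and `structure_similarity` are hypothetical names chosen for this illustration, and the metric (a multiset Jaccard index over node types) is just one simple choice.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count how often each AST node type occurs in the given Python source."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def structure_similarity(source_a: str, source_b: str) -> float:
    """Return the multiset Jaccard similarity of two node-type profiles (0.0 to 1.0)."""
    counts_a = node_type_counts(source_a)
    counts_b = node_type_counts(source_b)
    shared = sum((counts_a & counts_b).values())  # per-type minimum
    total = sum((counts_a | counts_b).values())   # per-type maximum
    return shared / total if total else 1.0


original = "def add(a, b):\n    return a + b\n"
renamed = "def plus(x, y):\n    return x + y\n"
different = "for i in range(3):\n    print(i)\n"

# Renaming identifiers does not change the node types, so the score stays at 1.0.
print(structure_similarity(original, renamed))
# Structurally different code scores lower.
print(structure_similarity(original, different))
```

Because the comparison looks at tree structure rather than text, renaming variables or functions alone does not lower the score.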

License: MIT License

C++ 1.21% Python 90.80% Makefile 4.21% templ 3.78%
python plagiarism-detection code-similarity plagiarism-checker education

Code Plagiarism Analysis

1. Install

1.1 Manual installation on the local system from sources

First, clone the repository and move into it.

sudo apt install git # if not installed
git clone https://github.com/OSLL/code-plagiarism.git
cd code-plagiarism/
  • OS Ubuntu Linux == 22.04

  • Python version == 3.10

  • Run these commands:

    sudo apt update
    sudo apt install python3 python3-pip
    sudo apt install clang libncurses5
    
    # Optional
    sudo apt install python3-venv
    pip3 install virtualenv
    python3 -m venv venv
    source venv/bin/activate
    
    pip3 install -U pip # pip3 version >= 19.0
    pip3 install argparse-manpage==3 requests==2.31.0
    pip3 install --upgrade setuptools # Ensure that an up-to-date version of setuptools is installed
    make
    

1.2 Build and run local Docker container

  • Create a code-plagiarism docker image

    $ make docker-image
    
  • Rebuild the created code-plagiarism docker image

    $ make docker-image REBUILD=1
    
  • Run the created code-plagiarism container

    $ make docker-run
    
  • Show help information about other make commands

    $ make help
    

1.3 Pull the Docker Image from Docker Hub

  • Pull an image from Docker Hub

    $ docker pull artanias/codeplag-ubuntu22.04:latest
    
  • Run a container based on the pulled image and mount a volume with your data

    The docker image has the volume '/usr/src/works', which is the directory for your data.

    $ docker run --rm --tty --interactive --volume <absolute_local_path_with_data>:/usr/src/works "artanias/codeplag-ubuntu22.04:latest" /bin/bash
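The volume argument needs an absolute host path. As a rough sketch, the command can be assembled from Python so that a relative data directory is resolved first. The helper `build_docker_run_command` and the `./works` directory are hypothetical and exist only for this example; the image name, flags, and container path are the ones shown in this section.

```python
import shlex
from pathlib import Path
from typing import List

IMAGE = "artanias/codeplag-ubuntu22.04:latest"
CONTAINER_DATA_DIR = "/usr/src/works"


def build_docker_run_command(local_data_dir: str) -> List[str]:
    """Build the docker run command with the data directory as an absolute path."""
    host_path = Path(local_data_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm", "--tty", "--interactive",
        "--volume", f"{host_path}:{CONTAINER_DATA_DIR}",
        IMAGE, "/bin/bash",
    ]


# Print the command instead of running it, so the sketch works without Docker.
print(shlex.join(build_docker_run_command("./works")))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` on a machine where Docker is available.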
    

1.4 Install with package manager apt-get

  • Download the installation package with the .deb extension from the releases tab;
  • Then run this command on the target system:
    $ sudo apt-get install <path_to_the_package>/<package_name>.deb
    

2. Tests

2.1. Pre-commit

  • Check code with linters, format code, and check used types with pre-commit.

    # Before local checking, you need to install dependencies into your virtual environment.
    $ python3 -m pip install --requirement docs/notebooks/requirements.txt
    $ python3 -m pip install $(python3 setup.py --build-requirements)
    $ python3 -m pip install $(python3 setup.py --install-requirements)
    $ make pre-commit
    
  • Also, before committing, you need to install pre-commit hooks in the repository.

    $ pre-commit install
    

2.2. Unit tests

  • Test the analyzers with pytest (the pytest framework must be installed beforehand).
    $ pip3 install pytest==7.4.0 pytest-mock==3.11.1
    $ make test
    

2.3. Auto tests

  • Test the util with the written autotests (requires the installed util and an 'ACCESS_TOKEN' with no access scopes; see section 3).
    $ make autotest
    

3. Work with codeplagcli

Before searching on GitHub, you may define the variable ACCESS_TOKEN in a .env file in the folder from which you want to run the app:

ACCESS_TOKEN - a personal access token that raises the limit on requests to repositories and, if you grant it, gives access to private repositories.
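For illustration, the sketch below checks whether a folder's .env file defines ACCESS_TOKEN. It assumes the common `KEY=VALUE` line format for .env files and does not reflect how codeplag itself loads the file; `read_env_file` and `get_access_token` are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict, Optional


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_access_token(folder: Path) -> Optional[str]:
    """Return ACCESS_TOKEN from <folder>/.env, or None if it is missing or empty."""
    return read_env_file(folder / ".env").get("ACCESS_TOKEN") or None


# Report only whether the token is defined; never print the token itself.
print("ACCESS_TOKEN defined:", get_access_token(Path.cwd()) is not None)
```

Running this in the folder from which you plan to start the app is a quick way to confirm that the variable is visible before searching on GitHub.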

To begin, call help to get information about the available CLI options:

$ codeplag --help

For more information about the CLI, run the following after make or in a docker container:

$ man codeplag

When using bash as your shell, codeplag can use argcomplete for auto-completion. For permanent completion activation, use:

$ register-python-argcomplete codeplag >> ~/.bashrc

4. Demo examples (they work in the project directory and with an installed codeplag package)

  • Show help: $ codeplag --help
  • Show help of subcommands (and further along the chain similarly): $ codeplag check --help
  • Setting up the util:
    # Set check threshold to 70
    # Language to English
    # Show check progress
    # Extension of reports 'csv'
    # Reports path to '/usr/src/works'
    # Path to environment variables '/usr/src/works/.env'
    $ codeplag settings modify --threshold 70 --language en --show_progress 1 --reports_extension csv --reports /usr/src/works --environment /usr/src/works/.env
    
  • Python analyzer:
    $ codeplag check --extension py --files src/codeplag/pyplag/astwalkers.py --directories src/codeplag/pyplag
    $ codeplag check --extension py --directories src/codeplag/algorithms src
    $ codeplag check --extension py --files src/codeplag/pyplag/astwalkers.py --github-user OSLL --repo-regexp code- --all-branches
    $ codeplag check --extension py --github-files https://github.com/OSLL/code-plagiarism/blob/main/src/codeplag/pyplag/utils.py --github-user OSLL --repo-regexp code- --all-branches
    $ codeplag check --extension py --github-files https://github.com/OSLL/code-plagiarism/blob/main/src/codeplag/pyplag/utils.py --directories src/codeplag/pyplag/
    $ codeplag check --extension py --directories src/ --github-user OSLL --repo-regexp code-
    $ codeplag check --extension py --github-project-folders https://github.com/OSLL/code-plagiarism/blob/main/src/codeplag/pyplag --github-user OSLL --repo-regexp code-
    $ codeplag check --extension py --github-project-folders https://github.com/OSLL/code-plagiarism/blob/main/src/codeplag/pyplag --directories src/codeplag/pyplag/
    
  • C++/C analyzer:
    $ codeplag check --extension cpp --directories src/codeplag/cplag/tests/data src/ --files test/codeplag/cplag/data/sample1.cpp test/codeplag/cplag/data/sample2.cpp
    $ codeplag check --extension cpp --github-files https://github.com/OSLL/code-plagiarism/blob/main/test/codeplag/cplag/data/sample3.cpp https://github.com/OSLL/code-plagiarism/blob/main/test/codeplag/cplag/data/sample4.cpp
    $ codeplag check --extension cpp --github-project-folders https://github.com/OSLL/code-plagiarism/tree/main/test
    $ codeplag check --extension cpp --github-user OSLL --repo-regexp "code-plag"
    
  • Create html report: $ codeplag report create --path /usr/src/works
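The demo commands can also be driven from a script. The following is a minimal sketch that builds a local check and a report command and runs them only if `codeplag` is found on the PATH. The helpers `build_check_command` and `build_report_command` are hypothetical; the subcommands and flags are limited to those that appear in the examples above.

```python
import shlex
import shutil
import subprocess
from typing import List, Sequence


def build_check_command(
    extension: str,
    files: Sequence[str] = (),
    directories: Sequence[str] = (),
) -> List[str]:
    """Build a 'codeplag check' command for local files and directories."""
    command = ["codeplag", "check", "--extension", extension]
    if files:
        command += ["--files", *files]
    if directories:
        command += ["--directories", *directories]
    return command


def build_report_command(path: str) -> List[str]:
    """Build a 'codeplag report create' command that writes to the given path."""
    return ["codeplag", "report", "create", "--path", path]


commands = [
    build_check_command(
        "py",
        files=["src/codeplag/pyplag/astwalkers.py"],
        directories=["src/codeplag/pyplag"],
    ),
    build_report_command("/usr/src/works"),
]

for command in commands:
    if shutil.which("codeplag") is None:
        # Without an installed codeplag package, only show what would be run.
        print("would run:", shlex.join(command))
    else:
        subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.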

code-plagiarism's People

Contributors

artanias, burnedscarecrow, dmtnikolaev

Forkers

muddi900

code-plagiarism's Issues

Add the ability to pass arguments to cplag other than the defaults.

It may become necessary to change --std=c++11 to newer versions of the language, or to add extra parameters to support third-party libraries.

args = '-x c++ --std=c++11'.split()
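One possible direction, sketched under the assumption that the argument list is built in a single place: replace the hard-coded string with a small function whose defaults reproduce the current behaviour. `build_clang_args` and the include path below are hypothetical and are not part of the existing cplag code.

```python
from typing import List, Optional, Sequence

DEFAULT_STD = "c++11"  # the value currently hard-coded in the line above


def build_clang_args(
    std: str = DEFAULT_STD,
    extra_args: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return compiler arguments; the defaults match the current hard-coded list."""
    args = ["-x", "c++", f"--std={std}"]
    if extra_args:
        # For example, include paths needed by third-party libraries.
        args.extend(extra_args)
    return args


print(build_clang_args())
print(build_clang_args("c++17", ["-I/opt/thirdparty/include"]))
```

Keeping the defaults identical means existing callers would not change behaviour, while a CLI option or setting could later feed `std` and `extra_args`.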

A codec problem similar to the one that occurred in GitHubParser can also arise in local check mode.

~/.local/lib/python3.8/site-packages/codeplag/pyplag/utils.py in get_ast_from_filename(filename)
     59     try:
     60         with open(filename) as f:
---> 61             tree = get_ast_from_content(f.read(), filename)
     62     except PermissionError:
     63         print("File denied.")

/usr/lib/python3.8/codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 12: invalid continuation byte
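A minimal sketch of one way to avoid this crash, assuming it is acceptable to substitute undecodable bytes rather than skip the file: read the raw bytes and fall back to a lossy decode. `read_source_text` is a hypothetical helper and is not the fix adopted by the project.

```python
from pathlib import Path


def read_source_text(filename: str) -> str:
    """Read a source file as UTF-8, replacing undecodable bytes instead of raising."""
    raw = Path(filename).read_bytes()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Each undecodable byte becomes U+FFFD, so the rest of the file stays usable.
        return raw.decode("utf-8", errors="replace")
```

The replacement characters usually land in comments or string literals, so the resulting text can often still be parsed; a stricter alternative would be to report the file and skip it.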
