osssanitizer / osspolice

Identifying Open-Source License Violation and 1-day Security Risk at Large Scale

License: GNU General Public License v3.0

Python 70.95% Shell 1.06% Java 9.23% Makefile 0.22% C 0.15% HTML 18.05% Gnuplot 0.27% JavaScript 0.07%
application-security license-checking code-clones

Introduction

This is the open source repository for OSSPolice, presented in the paper Identifying Open-Source License Violation and 1-day Security Risk at Large Scale [pdf]. For technical details, please refer to the paper.

The project consists of four components: osspolice, redis, postgresql and rabbitmq. For quick usage, please skip to the Usage section.

  1. osspolice is the main code base for this project. It is configured through main/config; a template that explains each config option is available in main/config.tmpl. osspolice is used to index new C/C++ repos and Java artifacts, and to query apk/jar/so/dex files against the indexed database to find reused Open Source Software (OSS) and their versions.
    • For feature extraction, the Java parser is located in main/java_parser and the native parser in main/native_parser. The former is open source; the latter is not, because (a) there are many good alternatives for native (C/C++) parsing, such as ctags and gtags, and (b) we have other work in progress. The current native parser code can easily be adapted to use other parsers.
  2. rabbitmq is the scheduler/broker for distributed deployment. The CELERY_BROKER_URL option in osspolice should be set to the URL of the scheduler.
  3. redis is the in-memory database used for indexing and searching. The NATIVE_NODES, NATIVE_VERSION_NODES, JAVA_NODES, JAVA_VERSION_NODES and RESULT_NODES options in osspolice should be set to match the redis setup. Since keys do not collide across these databases, they can be merged together.
    • NATIVE_NODES are prefixed with str-, func-, var-, file-, dir-, branch-, repo-. Reverse mapping replaces '-' with '_'.
    • JAVA_NODES are prefixed with strings-, classes-, normclasses-, centroids-, files-, dirs-, repo-. Reverse mapping prepends 'r-'.
    • NATIVE_VERSION_NODES and JAVA_VERSION_NODES are prefixed with software-, softwareversion-.
  4. postgresql is the database used for storing repo and artifact information. The NATIVE_DBS and JAVA_DBS options in osspolice should be set to match the postgresql setup.
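
The redis key-naming conventions listed under item 3 can be sketched in Python. This is a minimal illustration of one reading of that description, not code from the repository: the helper names native_reverse_key and java_reverse_key are hypothetical, and the sketch assumes that only the '-' ending the prefix is replaced in a native reverse-mapping key.

```python
# Prefixes as listed above; the helpers below are illustrative only.
NATIVE_PREFIXES = ("str-", "func-", "var-", "file-", "dir-", "branch-", "repo-")
JAVA_PREFIXES = ("strings-", "classes-", "normclasses-", "centroids-",
                 "files-", "dirs-", "repo-")


def native_reverse_key(key: str) -> str:
    """Hypothetical helper: swap the prefix's '-' for '_' (assumed reading)."""
    for prefix in NATIVE_PREFIXES:
        if key.startswith(prefix):
            # Only the separator at the end of the prefix is changed.
            return prefix[:-1] + "_" + key[len(prefix):]
    raise ValueError(f"not a native node key: {key!r}")


def java_reverse_key(key: str) -> str:
    """Hypothetical helper: prepend 'r-' to a Java node key."""
    for prefix in JAVA_PREFIXES:
        if key.startswith(prefix):
            return "r-" + key
    raise ValueError(f"not a Java node key: {key!r}")
```

Under these assumptions, native_reverse_key("str-hello") gives "str_hello" and java_reverse_key("classes-Foo") gives "r-classes-Foo".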

Dependencies

docker, docker-compose

Usage

If you are simply interested in testing the tool on your app, we are working on an online service. Please check back later!

If you are interested in building your own hierarchical indexing database, prebuilt postgresql databases are provided. You can load them using postgresql/load_data.sh. Lists of the repos/artifacts used in the paper are also provided in the data folder. You can use them to build your own database.

If you are interested in comparing with our tool, we also have a prebuilt indexing database available. Please email us at [email protected] for instructions on setting this database up.

  1. start rabbitmq scheduler
    • cd rabbitmq && docker-compose up
  2. start redis database
    • cd redis && docker-compose up
  3. start postgresql database and load data
    • In one terminal, cd postgresql && docker-compose up
    • After postgresql starts, in another terminal, ./load_data.sh
  4. start osspolice
    • customize your main/config from main/config.tmpl: point the broker to rabbitmq, the redis cluster to the redis databases, and postgresql to the postgresql database.
    • start osspolice worker
      • docker-compose up
    • start osspolice master
      • Start osspolice interactively, docker-compose run osspolice /bin/bash
      • Add jobs to broker, python detector.py apk_search $PATH_TO_APK
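
To queue several apps at once, the last step can be wrapped in a small driver. The sketch below is not part of the repository: it only repeats the python detector.py apk_search command shown above for each APK path, and it assumes it is run inside the osspolice container from the directory that contains detector.py. The function names are hypothetical.

```python
import shlex
import subprocess


def build_search_commands(paths):
    """Return one `python detector.py apk_search <apk>` command per .apk path."""
    return [["python", "detector.py", "apk_search", str(p)]
            for p in paths if str(p).lower().endswith(".apk")]


def submit_all(paths, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_search_commands(paths):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first submission that fails.
            subprocess.run(cmd, check=True)
```

Calling submit_all(["a.apk", "b.apk"]) only prints the commands; pass dry_run=False once the worker, broker and databases are running.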

Helper Scripts

  • Create GitHub accounts automatically. This script exploits the fact that GitHub accounts can be created with an invalid email address.
    • main/create-github-account.py
  • Check the status of the redis database. This script prints the status of the indexed native and Java databases.
    • main/redis_check.py
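
For a quick look at what has been indexed, keys can be tallied by the prefixes described in the Introduction. The sketch below is not main/redis_check.py and makes no claim about what that script prints; count_by_prefix is a hypothetical helper that works on any iterable of keys. With the redis-py client, such an iterable could come from scan_iter() on a connection to one node (an assumption about your deployment).

```python
from collections import Counter

# Key prefixes listed in the Introduction (native, Java and version nodes).
PREFIXES = ("str-", "func-", "var-", "file-", "dir-", "branch-", "repo-",
            "strings-", "classes-", "normclasses-", "centroids-",
            "files-", "dirs-", "software-", "softwareversion-")


def count_by_prefix(keys):
    """Count keys per known prefix; anything else is tallied under 'other'."""
    counts = Counter()
    for key in keys:
        if isinstance(key, bytes):
            # redis-py returns bytes unless decode_responses=True is set.
            key = key.decode("utf-8", "replace")
        for prefix in PREFIXES:
            if key.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            counts["other"] += 1
    return counts
```

Reverse-mapping keys do not start with any of these prefixes, so in this sketch they are counted under "other".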

TODO

  1. Support iOS and Windows app binaries
  2. Improve the robustness of native_parser and java_parser
  3. Add support for Python, JS etc.

License

This software is licensed under GPL-3.0. Please check the terms and restrictions at https://www.gnu.org/licenses/gpl-3.0.html.

osspolice's People

Contributors

lingfennan

osspolice's Issues

Problem installing Redis

Thank you, this is a very cool tool! I was trying to use it and ran into some trouble installing. Can you please tell me what I'm doing wrong?

Here's my information:

System information: MacOS 12.4 with Apple M1 Pro chip
Docker version: 20.10.17, build 100c701

This is the command I ran that generated an error:

$ cd redis && docker-compose up

This is the error I got:

=> [ 8/18] RUN tar -zxf ruby-2.2.2.tar.gz && cd ruby-2.2.2 && ./configure && make && make install                 165.6s
 => ERROR [ 9/18] RUN gem install redis                                                                              0.4s
------
 > [ 9/18] RUN gem install redis:
#13 0.348 ERROR:  While executing gem ... (Gem::Exception)
#13 0.348     Unable to require openssl, install OpenSSL and rebuild ruby (preferred) or use non-HTTPS sources
------
executor failed running [/bin/sh -c ge
ERROR: Service 'redis-cluster' failed to build : Build failed

Any advice or thoughts?
Thank you.

Can't run your OSSPolice

I managed to set up osspolice, but I still can't run anything with it. Do I have to download the repos myself? I couldn't find the controller code that manages that part, and when I run a command, it tells me that redis is empty.

I would also like to know what each command under /main does. This has been confusing me even after looking at your paper.

Please give me a list of steps that I can follow to get at least one result.

Thanks!
