A hybrid graph neural network approach for detecting PHP vulnerabilities

License: MIT License

Python 2.92% Jupyter Notebook 14.31% PHP 4.78% CSS 0.74% JavaScript 1.67% GAP 4.27% Java 71.26% Shell 0.02% Batchfile 0.02%

DeepTective

A Hybrid Graph Neural Network Approach for Detecting PHP Vulnerabilities

DeepTective architecture

Description

This repository contains the code used to detect vulnerabilities in PHP source code,
namely XSS, SQLi and command injection.

The data and all data-related preprocessing scripts can be found in the data/ directory.

The ML training scripts are in the root directory. Our best model was the combined LSTM + CFG model, which is saved as model_combine.pt.
You can apply it to your own code by running run_model.py.
Before doing so, change the directory variable in that script to point to the directory containing your code.
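
Before pointing run_model.py at a project, it can help to confirm that the directory actually contains PHP files. The sketch below is a standalone check and not part of the repository: the helper name find_php_files is hypothetical, and run_model.py may collect its input files differently.

```python
from pathlib import Path


def find_php_files(directory):
    """Return all .php files under `directory`, sorted, searching recursively."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory!r} is not a directory")
    # rglob also walks subdirectories; keep regular files only.
    return sorted(p for p in root.rglob("*.php") if p.is_file())
```

Calling find_php_files with the same path you set for the directory variable should return a non-empty list; an empty list suggests the path is wrong.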

Note that main_graph_cfg_or_pdg.py trains on CFG data by default.
To train on PDG data, change the names of the pickle files at the top of the file to the relevant ones.

Data

Before running anything, download all_data.zip with the command below and unzip it.

wget --no-check-certificate -O all_data.zip "https://onedrive.live.com/download?cid=15E206B36A9C8AE7&resid=15E206B36A9C8AE7%21300809&authkey=AECPe5YU26hRj_g"
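
Once the download finishes, the archive still has to be extracted. A minimal sketch using Python's standard zipfile module is shown below; the helper name extract_archive and the default target directory are assumptions, since the README does not say where the data should be extracted.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir="."):
    """Extract every member of a zip archive into target_dir and return their names."""
    archive_path = Path(archive_path)
    if not zipfile.is_zipfile(archive_path):
        # A failed download can leave something other than a zip file behind.
        raise ValueError(f"{archive_path} is not a valid zip archive")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

For example, extract_archive("all_data.zip") extracts into the current directory. The validity check is there so that a broken download fails with a clear message.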

In the data directory, each script generates a certain type of data.
Below are the script name patterns along with the data each one generates.
The * symbol stands for one of SARD, NVD or GIT:

  • prepare_dataset_*.py - Sequence of tokens
  • prepare_graph_dataset_*.py - Control Flow Graphs
  • prepare_dependence_graph_dataset_*.py - Program Dependence Graphs
  • prepare_AST_dataset_*.py - Abstract Syntax Trees
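
The naming scheme above can be expanded into the full set of script names. The sketch below only illustrates the pattern: it assumes the * is replaced by the source name exactly as written (SARD, NVD, GIT), and the actual file names in data/ may use different casing.

```python
from itertools import product

# Prefixes and the data each script generates, as listed above.
PREFIXES = {
    "prepare_dataset": "Sequence of tokens",
    "prepare_graph_dataset": "Control Flow Graphs",
    "prepare_dependence_graph_dataset": "Program Dependence Graphs",
    "prepare_AST_dataset": "Abstract Syntax Trees",
}

# Assumed spelling of the dataset sources in the file names.
SOURCES = ("SARD", "NVD", "GIT")


def expected_scripts():
    """Map each expected script name to the kind of data it generates."""
    return {
        f"{prefix}_{source}.py": output
        for (prefix, output), source in product(PREFIXES.items(), SOURCES)
    }
```

This gives four data types for each of three sources, so twelve script names in total.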

Models

Before running any scripts, download all_models.zip with the command below and unzip it.

wget --no-check-certificate -O all_models.zip "https://onedrive.live.com/download?cid=15E206B36A9C8AE7&resid=15E206B36A9C8AE7%21300810&authkey=AJuxUdOyJDxdtp4"

Unzip all files into the root directory.
3 models will be extracted:

  • model_fileA.pt
  • model_funcA.pt
  • model_combine.pt
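
After unzipping, a quick check that all three files landed in the root directory can save a confusing error later. This is a small sketch with a hypothetical helper name, missing_models; the file names are the ones listed above.

```python
from pathlib import Path

# Model file names as listed above.
EXPECTED_MODELS = ("model_fileA.pt", "model_funcA.pt", "model_combine.pt")


def missing_models(root="."):
    """Return the expected model files that are not present in `root`."""
    root = Path(root)
    return [name for name in EXPECTED_MODELS if not (root / name).is_file()]
```

An empty list from missing_models() means all three models are in place.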

File-level

  • To train at file-level, use main_graph_combine_fileLevel.py
  • To evaluate with the pretrained model model_fileA.pt, use main_graph_combine_fileLevel(eval).py and set the dataset name inside the script.

Citation

Accepted as a conference paper (oral presentation) at the IEEE Conference on Dependable and Secure Computing (DSC) 2022. Link to the paper: https://ieeexplore.ieee.org/document/9888816

If you refer to our work, please cite our paper as below:

@INPROCEEDINGS{rabheru2021deeptective,
  author={Rabheru, Rishi and Hanif, Hazim and Maffeis, Sergio},
  booktitle={2022 IEEE Conference on Dependable and Secure Computing (DSC)},
  title={DeepTective: Detection of PHP Vulnerabilities Using Hybrid Graph Neural Networks},
  year={2022},
  volume={},
  number={},
  pages={1-8},
  doi={10.1109/DSC54232.2022.9888816}
}

deeptective's Issues

Trouble with dependencies

Hi, I liked your DeepTective presentation.
I tried to run the network myself, but I ran into trouble with the dependencies.

Let's try:

git clone git@github.com:ICL-ml4csec/DeepTective.git
cd DeepTective
touch Pipfile
pipenv install -r requirements.txt

And I get an error message:

ERROR:pip.subprocessor:[present-rich] pip subprocess to install build dependencies exited with 1
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/resolver.py", line 645, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/resolver.py", line 612, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve(
[ResolutionFailure]:       ^^^^^^^^
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/resolver.py", line 592, in resolve
[ResolutionFailure]:       return resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 897, in resolve_deps
[ResolutionFailure]:       results, hashes, internal_resolver = actually_resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^^^^^^^^^^
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 670, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/usr/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 447, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=str(e))
[pipenv.exceptions.ResolutionFailure]: Warning: Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
  You can use $ pipenv run pip install <requirement_name> to bypass this mechanism, then run $ pipenv graph to inspect the versions actually installed in the virtualenv.
  Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: pip subprocess to install build dependencies exited with 1

Traceback (most recent call last):
  File "/usr/bin/pipenv", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/cli/options.py", line 58, in main
    return super().main(*args, **kwargs, windows_expand_args=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/decorators.py", line 84, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/vendor/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/cli/command.py", line 209, in install
    do_install(
  File "/usr/lib/python3.11/site-packages/pipenv/routines/install.py", line 164, in do_install
    do_init(
  File "/usr/lib/python3.11/site-packages/pipenv/routines/install.py", line 672, in do_init
    do_lock(
  File "/usr/lib/python3.11/site-packages/pipenv/routines/lock.py", line 65, in do_lock
    venv_resolve_deps(
  File "/usr/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 838, in venv_resolve_deps
    c = resolve(cmd, st, project=project)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 707, in resolve
    raise RuntimeError("Failed to lock Pipfile.lock!")
RuntimeError: Failed to lock Pipfile.lock!

It looks like there is a mistake in the requirements.txt file.
Let's try installing the dependencies another way:

git clone git@github.com:ICL-ml4csec/DeepTective.git
cd DeepTective
touch Pipfile
pipenv run pip install -r requirements.txt

But this fails as well:

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [59 lines of output]
      Running from numpy source directory.
      <string>:470: UserWarning: Unrecognized setuptools command, proceeding with generating Cython sources and expanding templates
      /tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py:73: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
        required_version = LooseVersion('0.29.21')
      /tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py:75: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
        if LooseVersion(cython_version) < required_version:
      warning: /tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/numpy/__init__.pxd:17:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310
      warning: /tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/numpy/__init__.pxd:17:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See https://github.com/cython/cython/issues/4310

      Error compiling Cython file:
      ------------------------------------------------------------
      ...
          def __init__(self, seed=None):
              BitGenerator.__init__(self, seed)
              self.rng_state.pcg_state = &self.pcg64_random_state

              self._bitgen.state = <void *>&self.rng_state
              self._bitgen.next_uint64 = &pcg64_uint64
                                         ^
      ------------------------------------------------------------

      _pcg64.pyx:113:35: Cannot assign type 'uint64_t (*)(void *) except? -1 nogil' to 'uint64_t (*)(void *) noexcept nogil'. Exception values are incompatible. Suggest adding 'noexcept' to type 'uint64_t (void *) except? -1 nogil'.
      Processing numpy/random/_bounded_integers.pxd.in
      Processing numpy/random/mtrand.pyx
      Processing numpy/random/_pcg64.pyx
      Traceback (most recent call last):
        File "/tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py", line 235, in <module>
          main()
        File "/tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py", line 231, in main
          find_process_files(root_dir)
        File "/tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py", line 222, in find_process_files
          process(root_dir, fromfile, tofile, function, hash_db)
        File "/tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py", line 188, in process
          processor_function(fromfile, tofile)
        File "/tmp/pip-install-beyaur8n/numpy_7485967926394920bf9d941a4447ef40/tools/cythonize.py", line 77, in process_pyx
          subprocess.check_call(
        File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['/home/sd/.venv/bin/python', '-m', 'cython', '-3', '--fast-fail', '-o', '_pcg64.c', '_pcg64.pyx']' returned non-zero exit status 1.
      Cythonizing sources
      Traceback (most recent call last):
        File "/home/sd/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/sd/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/sd/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dy898dio/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 396, in prepare_metadata_for_build_wheel
          self.run_setup()
        File "/tmp/pip-build-env-dy898dio/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 507, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-dy898dio/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 499, in <module>
        File "<string>", line 479, in setup_package
        File "<string>", line 274, in generate_cython
      RuntimeError: Running cythonize failed!
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

What am I doing wrong? Have you seen this problem? How do you launch the project? Is the trouble on my side?
