beyondacm / smartembed

A Tool for clone detection and bug detection in smart contracts

Python 6.83% CSS 0.94% JavaScript 49.22% HTML 2.51% ANTLR 1.23% Shell 0.08% Java 37.98% Solidity 1.21%
smart-contracts ethereum code-clones bug-detection code-embedding clones

smartembed's Introduction

SmartEmbed Web Tool

SmartEmbed is a web service tool for clone detection and bug detection in smart contracts. We have recently added an interface for estimating the similarity between smart contracts with SmartEmbed, so please feel free to try it! :) Questions and feedback are very welcome.

Our full research paper, Checking Smart Contracts with Structural Code Embedding, has been published in TSE (IEEE Transactions on Software Engineering). It describes in detail how SmartEmbed performs clone detection and bug detection in smart contracts:
https://ieeexplore.ieee.org/document/8979435
https://arxiv.org/abs/2001.07125

SmartEmbed has been published as a tool demo at ICSME-2019. For details of the implementation, please refer to our paper:
https://arxiv.org/abs/1908.08615

Our work, When Deep Learning Meets Smart Contracts, has been accepted by the ASE-2020 Student Research Competition track. For more details, please refer to our paper:
http://arxiv.org/abs/2008.04093

Our tool is available at the following URL:
http://www.smartembed.tools/

A tutorial video on YouTube shows how to use SmartEmbed:
https://youtu.be/o9ylyOpYFq8

Source data can be downloaded from:
https://drive.google.com/file/d/13iTTpt7gFd9wEW35C2fX4pVT7cVlHgxi/view?usp=sharing

Please cite our work if you find it helpful:
Checking Smart Contracts with Structural Code Embedding:

@article{gao2020checking,
title={Checking Smart Contracts with Structural Code Embedding},
author={Gao, Zhipeng and Jiang, Lingxiao and Xia, Xin and Lo, David and Grundy, John},
journal={IEEE Transactions on Software Engineering},
year={2020},
publisher={IEEE}
}

Smartembed: A tool for clone and bug detection in smart contracts through structural code embedding:

@inproceedings{gao2019smartembed,
title={Smartembed: A tool for clone and bug detection in smart contracts through structural code embedding},
author={Gao, Zhipeng and Jayasundara, Vinoj and Jiang, Lingxiao and Xia, Xin and Lo, David and Grundy, John},
booktitle={2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
pages={394--397},
year={2019},
organization={IEEE}
}

When Deep Learning Meets Smart Contracts

@article{gao2020deep,
title={When Deep Learning Meets Smart Contracts},
author={Gao, Zhipeng},
journal={arXiv preprint arXiv:2008.04093},
year={2020}
}

Introduction

This folder contains the code for the SmartEmbed web tool. There are a few important subfolders and files as follows.

  • templates - contains the frontend HTML files
  • static - contains the CSS files and JS scripts
  • app.py - the main Flask file; see below for usage
  • similarity.py and smart_embed.py - contain the backend code for clone detection
  • bug.py and smart_bug.py - contain the backend code for bug detection

Pretrained Models

We have released the pretrained models described in the paper. You can download them with the following commands:

# gdown downloads files from Google Drive; it only needs to be installed once
pip install gdown

# Contract-level embeddings
gdown "https://drive.google.com/uc?id=1-LKJTZakqd8ntKzqVNtQZUgdZnFoYtpK"
unzip Contract_Embedding.zip
cp -r Embedding/ SmartEmbed/contract_level/

# Contract-level model
gdown "https://drive.google.com/uc?id=1lbaQVtZbNuEEjHIWVnwLqGvILxNWwtZW"
unzip Contract_Model.zip
mv Model SmartEmbed/contract_level/

# Statement-level model
gdown "https://drive.google.com/uc?id=18GiDgSwoRjPC25d2Vp15oi_xH2NivyXH"
unzip Statement_Model.zip
mv Model SmartEmbed/statement_level/
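
After running the download commands, it can help to confirm that the models landed where the tool expects them. The following is a minimal sketch, not part of SmartEmbed itself: the helper name missing_model_dirs is hypothetical, and the directory layout is an assumption inferred from the cp and mv commands above (in particular, that cp -r Embedding/ creates SmartEmbed/contract_level/Embedding, as GNU cp does).

```python
from pathlib import Path

# Directories the download commands are assumed to create,
# relative to the root of the cloned SmartEmbed repository.
EXPECTED_DIRS = (
    "contract_level/Embedding",
    "contract_level/Model",
    "statement_level/Model",
)


def missing_model_dirs(repo_root):
    """Return the expected model directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "SmartEmbed" is the directory created by git clone.
    missing = missing_model_dirs("SmartEmbed")
    if missing:
        print("Missing model directories:", ", ".join(missing))
    else:
        print("All pretrained model directories are in place.")
```

If the script reports a missing directory, re-run the corresponding download block before starting the web tool.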

SmartEmbed Web Tool Setup and Usage

  1. Clone this project to your local machine: git clone https://github.com/beyondacm/SmartEmbed.git.
  2. Install the dependencies with pip install -r requirements.txt.
  3. Download the pretrained models with the shell commands given above.
  4. Change directory with cd SmartEmbed/todo/ and run python app.py. This starts the web tool at localhost:9000.
  5. Paste the smart contract into the text area and click Submit.
  6. The clone detection results will be displayed.
  7. The bug detection results will be displayed.
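
To confirm from a script that the web tool started, you can probe the address mentioned in the setup steps. This is a minimal sketch using only the Python standard library; the function name web_tool_is_up is hypothetical, and it assumes the Flask app answers plain HTTP GET requests on the root path of localhost:9000.

```python
import urllib.error
import urllib.request


def web_tool_is_up(url="http://localhost:9000", timeout=3.0):
    """Return True if an HTTP GET to url succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Anything in the 2xx or 3xx range counts as "up".
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if web_tool_is_up():
        print("SmartEmbed web tool is reachable at localhost:9000")
    else:
        print("No response; check that python app.py is still running")
```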

SmartEmbed Interface Usage

You can use SmartEmbed to estimate the similarity between two smart contracts. The following code snippet gives an example:

from smartembed import SmartEmbed

se = SmartEmbed()

# read contract1 from file
with open('./todo/test.sol', 'r') as f:
    contract1 = f.read()
# get vector representation for contract1
vector1 = se.get_vector(contract1)

# read contract2 from file
with open('./todo/KOTH.sol', 'r') as f:
    contract2 = f.read()
# get vector representation for contract2
vector2 = se.get_vector(contract2)

# estimate similarity between contract1 and contract2
similarity = se.get_similarity(vector1, vector2)
print("similarity between c1 and c2:", similarity)
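
This README does not state which metric get_similarity uses. For intuition only, the sketch below shows cosine similarity, a common way to compare two embedding vectors. It is an illustration under that assumption, not SmartEmbed's actual implementation, and the function name cosine_similarity is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy vectors standing in for two contract embeddings.
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Under this metric, vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score -1.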

Contact

[email protected]
[email protected]
Discussions, suggestions and questions are welcome!

smartembed's People

Contributors

beyondacm, vinojjayasundara, zpgao

smartembed's Issues

build and run locally

I have downloaded the pretrained models and put them in the right directories. However, when I follow the procedure in README.md, the command python app.py fails. I suspect this is caused by inconsistent Python versions. Is there a solution?

Function level tokenizer

Hi, great repo. Could you share the Normalizer and Java Parser code for function_level, so that I can regenerate token lists similar to function_normalized_tokens and function_tokens in the original processed dataset?

Models Incompatible with Gensim-3.8.x and later

I am trying to run your tool to try it out, but I am running into an issue with gensim when I try to run the app in todo/app.py. My gensim package version is gensim-3.2.0, and I get this error:

TypeError: Pre-gensim-3.8.x fastText models with nonstandard hashing are no longer compatible. Loading your old model into gensim-3.8.3 & re-saving may create a model compatible with gensim 4.x.

It looks like your pre-trained models used an older version of gensim. I tried rolling back my gensim version, but then I just run into other compatibility issues. Have you given any thought to updating your pre-trained models to work with the latest version of gensim? Or do you have any other suggestions for getting around this issue?
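
The error message itself hints at one possible workaround: load the old model in gensim 3.8.3 and save it again. The sketch below shows what that could look like. It is untested against the SmartEmbed models, it must run in an environment with gensim 3.8.3 installed, and the model file name in the usage example is a placeholder, since the actual file names inside the Model directories are not given here. The helpers resaved_path and resave_fasttext_model are hypothetical names.

```python
from pathlib import Path


def resaved_path(model_path, suffix="_resaved"):
    """Build an output path next to model_path, e.g. ft.model -> ft_resaved.model."""
    p = Path(model_path)
    return p.with_name(p.stem + suffix + p.suffix)


def resave_fasttext_model(model_path):
    """Load a gensim FastText model and save it again under a new name."""
    # Imported here so the path helper can be used without gensim installed.
    from gensim.models import FastText

    model = FastText.load(str(model_path))
    target = resaved_path(model_path)
    model.save(str(target))
    return target


if __name__ == "__main__":
    # Placeholder path: replace with the real model file inside a Model directory.
    print(resave_fasttext_model("SmartEmbed/contract_level/Model/example.model"))
```

The original file is left in place, so the re-saved copy can be tested with a newer gensim version before anything is overwritten.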

node definition

Sorry to bother you. In your paper, you mention 308 node types, but how can we know this? Does 308 refer to AST node types or Solidity grammar rules?
