xadityax / locality-sensitive-hashing-dna-seqs


Locality-Sensitive-Hashing-DNA-Seqs

Implementing Locality Sensitive Hashing for DNA Sequences.

Preprocessing

A DNA sequence dataset from Kaggle is used. The class labels are separated from the data, and only the sequences are used.
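A minimal sketch of this step, assuming the dataset is a delimited text file with one sequence and one class label per row. The column names `sequence` and `class`, the tab delimiter, and the function name `load_sequences` are hypothetical, not taken from the repository:

```python
import csv


def load_sequences(csv_file, sequence_column="sequence",
                   class_column="class", delimiter="\t"):
    """Split rows of an open text file into sequences and class labels.

    Column names and delimiter are assumptions; adjust them to the
    actual dataset file.
    """
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    sequences, labels = [], []
    for row in reader:
        sequences.append(row[sequence_column].strip())
        labels.append(row[class_column])
    return sequences, labels
```

Only the returned `sequences` list would feed the later steps; the labels are set aside.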

Shingling

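Each sequence is broken into overlapping substrings (shingles) of a fixed length. The shingle size is taken as input; a value between 5 and 10 is recommended.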
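As an illustration, shingling a sequence could look like the sketch below. The function names `shingle` and `jaccard` are hypothetical and this is not necessarily how the repository implements it:

```python
def shingle(sequence, k):
    """Return the set of all overlapping length-k substrings of a sequence."""
    if k <= 0:
        raise ValueError("shingle size must be positive")
    # A sequence shorter than k yields an empty set.
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

For example, `shingle("ATGCATGC", 5)` gives the four shingles `ATGCA`, `TGCAT`, `GCATG` and `CATGC`. The Jaccard similarity of two shingle sets is the quantity that minhashing approximates.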

Minhashing

Random permutations of pseudo indices are used to generate a signature for each sequence (document). The number of permutations is taken as input and determines the length of each signature.
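A minimal sketch of permutation-based minhashing, assuming every distinct shingle in the corpus has already been mapped to an integer index in `range(universe_size)`. The names `make_permutations`, `minhash_signature` and `estimated_similarity` are hypothetical:

```python
import random


def make_permutations(universe_size, num_perm, seed=0):
    """Build num_perm random permutations of the indices 0..universe_size-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_perm):
        perm = list(range(universe_size))
        rng.shuffle(perm)  # perm[i] is the permuted position of index i
        perms.append(perm)
    return perms


def minhash_signature(shingle_ids, perms):
    """One signature entry per permutation: the smallest permuted position
    among the shingle indices present in the sequence."""
    if not shingle_ids:
        raise ValueError("cannot minhash an empty shingle set")
    return [min(perm[i] for i in shingle_ids) for perm in perms]


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree (estimates Jaccard)."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

The same list of permutations must be used for every sequence so that signatures are comparable. More permutations give a better similarity estimate at the cost of longer signatures.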

LSH

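The number of bands is taken as input. Each signature is split into that many bands, and within each band the sequences are hashed into buckets by their portion of the signature. Two similar sequences have a high probability of landing in the same bucket in at least one band, which makes them a candidate pair.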
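The banding step can be sketched as follows. This is an assumption-laden illustration rather than the repository's code: it requires the signature length to be divisible by the number of bands, uses each band's tuple of values directly as the bucket key, and the names `lsh_candidate_pairs` and `candidate_probability` are hypothetical:

```python
from collections import defaultdict
from itertools import combinations


def lsh_candidate_pairs(signatures, num_bands):
    """signatures maps a sequence name to its minhash signature (a list).

    Returns the set of name pairs that share a bucket in at least one band.
    """
    if not signatures:
        return set()
    sig_len = len(next(iter(signatures.values())))
    if sig_len % num_bands != 0:
        raise ValueError("signature length must be divisible by num_bands")
    rows = sig_len // num_bands
    candidates = set()
    for band in range(num_bands):
        buckets = defaultdict(list)
        for name, sig in signatures.items():
            # The band's slice of the signature acts as the bucket key.
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates


def candidate_probability(s, num_bands, rows):
    """Probability that two sequences with Jaccard similarity s share
    a bucket in at least one band: 1 - (1 - s**rows)**num_bands."""
    return 1 - (1 - s ** rows) ** num_bands
```

With a fixed signature length, more bands (fewer rows per band) make it easier for a pair to become a candidate, so recall rises and more false positives appear. Fewer bands have the opposite effect.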

GUI

The GUI is built with Tkinter in Python.

How to run

Enter the number of bands, the number of permutations, and the shingle size before providing the corpus directory as input to start LSH.
