Equivalent Junctions

This repository contains Python scripts for extracting equivalent junctions from a genome. The scripts are general and can be used with any genome.

Software

  • Python 3.6.1 with Biopython 1.71 installed

Python scripts

getJunctionsFromGTF.py:

  • determines equivalent junctions from a genome fasta file and an annotation file in the GTF format
  • usage:
    python getJunctionsFromGTF.py -f genome_file.fa
                                  -a annotation_file.gtf
                                  -o output_file.txt
    
  • output file:
    • a tab delimited file with an extracted equivalent junction sequence for each annotated exon-exon boundary:
      equiv_junc_sequence    gene_name    chromosome    strand    donor_exon_coordinate    acceptor_exon_coordinate 
  • example equivalent junction in the output file:
      AG    LEPR    chr1    +    65420740    65425302
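The output format above can be read back into Python with the standard library. The following is a minimal sketch, not part of the repository: the names `JunctionRecord` and `read_junctions` are hypothetical, and it assumes that every data row has the six tab-separated columns listed above and that any row whose coordinate fields are not integers (for example a header line, if one is present) can be skipped.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class JunctionRecord(NamedTuple):
    """One row of the output file (hypothetical helper type)."""
    equiv_junc_sequence: str
    gene_name: str
    chromosome: str
    strand: str
    donor_exon_coordinate: int
    acceptor_exon_coordinate: int


def read_junctions(handle: TextIO) -> Iterator[JunctionRecord]:
    """Yield one JunctionRecord per tab-delimited data row."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6:
            continue  # assumption: short or blank lines carry no junction
        seq, gene, chrom, strand, donor, acceptor = (f.strip() for f in row[:6])
        if not (donor.isdigit() and acceptor.isdigit()):
            continue  # assumption: non-numeric coordinates mean a header row
        yield JunctionRecord(seq, gene, chrom, strand, int(donor), int(acceptor))


# The example row from this README, written with real tab characters.
example = io.StringIO("AG\tLEPR\tchr1\t+\t65420740\t65425302\n")
records = list(read_junctions(example))
print(records[0])
```

To read a real file, pass an open file handle instead of the `io.StringIO` object, for example `open("output_file.txt", newline="")`.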

getJunctionsFromTxt.py:

  • determines equivalent junction sequences from a genome fasta file and an annotation file in the BED format containing chromosome, donor, and acceptor coordinates.
  • usage:
    python getJunctionsFromTxt.py -f genome_file.fa
                                  -t annotation_file.txt
                                  -o output_file.txt
                                  -c chromosome_column (default=0)
                                  -d donor_coordinate_column (default=1)
                                  -a acceptor_coordinate_column (default=2)
                                  -s strand_column (default=3)
  • output file:
    • a tab delimited file with the extracted equivalent junction sequence for each annotated exon-exon boundary:
      equiv_junc_sequence    gene_name    chromosome    strand    donor_exon_coordinate    acceptor_exon_coordinate 
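To illustrate how the column options relate to a line of the annotation file, here is a small sketch. It is not the script's own code: the function `extract_junction_fields` is hypothetical, and it assumes that the annotation file is tab-delimited and that column indices are 0-based, as the default of 0 for the chromosome column suggests.

```python
def extract_junction_fields(line, chrom_col=0, donor_col=1,
                            acceptor_col=2, strand_col=3, sep="\t"):
    """Pick chromosome, donor, acceptor, and strand out of one annotation line.

    The defaults mirror the -c, -d, -a, and -s defaults listed in the usage.
    Assumption: columns are 0-based and separated by tabs.
    """
    fields = line.rstrip("\n").split(sep)
    return (
        fields[chrom_col],
        int(fields[donor_col]),
        int(fields[acceptor_col]),
        fields[strand_col],
    )


# Default layout: chromosome, donor, acceptor, strand.
default_layout = extract_junction_fields("chr1\t65420740\t65425302\t+\n")

# A file with a different column order needs explicit column indices,
# which is what the -c, -d, -a, and -s options are for.
custom_layout = extract_junction_fields(
    "+\tchr1\t65425302\t65420740\n",
    chrom_col=1, donor_col=3, acceptor_col=2, strand_col=0,
)

print(default_layout)
print(custom_layout)
```

Both calls return the same tuple, which shows that only the column indices change when the input layout differs from the defaults.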

Note: Example genome fasta and annotation files that have been analyzed for equivalent junctions are provided in equiv_junc_genomes.

Computing the number of equivalent junctions:

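After running either Python script, output_file.txt contains the equivalent junction sequence for each exon-exon junction. The following command drops the gene_name column, so that a junction annotated under more than one gene is counted once, and then counts the unique junctions for each equivalent junction sequence, sorted from most to least frequent: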

cut -f1,3,4,5,6 output_file.txt | sort -u | cut -f1 | sort | uniq -c | sort -k1nr > output_file_equivSeqCounts.txt
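The same count can be sketched in Python for use inside a larger analysis. This mirrors the steps of the shell pipeline above and is not part of the repository; the function name `count_equiv_sequences` is hypothetical, and the gene names `GENE_A`, `GENE_B`, and `GENE_C` and their coordinates in the example are made up for illustration.

```python
from collections import Counter


def count_equiv_sequences(lines):
    """Count unique junctions per equivalent junction sequence.

    Mirrors: cut -f1,3,4,5,6 | sort -u | cut -f1 | sort | uniq -c | sort -k1nr
    """
    unique_junctions = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # assumption: skip blank or malformed lines
        # Keep columns 1, 3, 4, 5, 6 (drop gene_name) and de-duplicate.
        unique_junctions.add(
            (fields[0], fields[2], fields[3], fields[4], fields[5])
        )
    counts = Counter(junction[0] for junction in unique_junctions)
    # most_common() orders the pairs by count, largest first.
    return counts.most_common()


example_lines = [
    "AG\tLEPR\tchr1\t+\t65420740\t65425302\n",
    # Same junction listed under a second (made-up) gene name: counted once.
    "AG\tGENE_A\tchr1\t+\t65420740\t65425302\n",
    "AG\tGENE_B\tchr3\t+\t300\t400\n",
    "G\tGENE_C\tchr2\t-\t100\t200\n",
]
sequence_counts = count_equiv_sequences(example_lines)
print(sequence_counts)  # [('AG', 2), ('G', 1)]
```

For a real run, pass an open file handle for output_file.txt in place of `example_lines`.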

Contributors

  • roozbehdn
