OrfFinder

A simple Python and tBLASTn-based open reading frame (ORF) finding tool.

Installation

Dependencies

OrfFinder requires Python and is easiest to install using pip and git. It also requires NCBI BLAST+ (ncbi-blast+) to be installed and available on the system command line.

On Linux, install using your normal package manager, for example:

sudo apt update
sudo apt install ncbi-blast+ python3 python3-pip git

On Windows, download and install ncbi-blast+ using the installer available from the NCBI website.

You can install Python, pip and git using winget in either Command Prompt or PowerShell:

winget install Python.Python.3.0
winget install Git.Git
python3 -m pip install --upgrade pip

Alternatively, BLAST+ can be installed in a conda environment by running:

conda install -c bioconda blast

Please see the Conda documentation on how to install conda.
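Once the dependencies are installed, a quick check that they are visible on the command line can save debugging later. The following is a minimal sketch, not part of orffinder; it assumes the BLAST+ executable needed is named tblastn (the standard BLAST+ name) and the helper name check_dependencies is hypothetical.

```python
import shutil
import sys


def check_dependencies(tools=("tblastn", "git")):
    """Map each tool name to its full path, or None if it is not on PATH.

    The default tool names are assumptions: tblastn ships with BLAST+,
    and git is only needed for the pip-from-git install.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for tool, path in check_dependencies().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If tblastn is reported as not found, check that the BLAST+ bin directory is on your PATH, or that the conda environment containing blast is activated.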

OrfFinder

Install using pip and git:

pip install git+https://github.com/zephyris/orffinder

To reinstall or upgrade, use pip and git:

pip install --upgrade --force-reinstall git+https://github.com/zephyris/orffinder

To uninstall, use pip:

pip uninstall orffinder

Standalone usage

orffinder uses a few simple heuristics for finding ORFs. It starts by finding stop codons, then extends leftwards from each one, collecting all possible start codons in frame with that stop codon, until it hits another in-frame stop codon or runs out of sequence. These candidate ORFs are then evaluated using simple properties (length, overlap with other ORFs, position) and tBLASTn against a reference genome collection.
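The stop-codon-first search described above can be sketched as follows. This is an illustration only, not orffinder's actual code: the function name find_orfs is hypothetical, it assumes ATG is the only start codon and TAA, TAG and TGA are the stop codons, and it only scans the forward strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODONS = {"ATG"}               # assumption: ATG starts only


def find_orfs(seq):
    """Return (start, end, starts) tuples for ORFs on the forward strand.

    start:  index of the leftmost in-frame start codon
    end:    index one past the stop codon
    starts: all in-frame start codon indices, left to right
    """
    seq = seq.upper()
    orfs = []
    for stop in range(len(seq) - 2):
        if seq[stop:stop + 3] not in STOP_CODONS:
            continue
        # Walk leftwards one codon at a time, staying in frame with the stop.
        starts = []
        pos = stop - 3
        while pos >= 0:
            codon = seq[pos:pos + 3]
            if codon in STOP_CODONS:
                break  # an upstream in-frame stop ends the search
            if codon in START_CODONS:
                starts.append(pos)
            pos -= 3
        if starts:
            starts.reverse()
            orfs.append((starts[0], stop + 3, starts))
    return orfs


if __name__ == "__main__":
    # One ORF with two candidate start codons (indices 2 and 8).
    print(find_orfs("CCATGAAAATGCCCTAGGG"))
```

Keeping every in-frame start codon, rather than only the leftmost, is what allows the start codon to be refined later.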

The choice of start codon is then checked by running tBLASTn on each start-codon-to-start-codon fragment of the ORF, extending short fragments rightwards to a minimum length. Fragments at the start of the ORF with abnormally low tBLASTn hits for that protein are trimmed from the ORF.
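As a rough illustration of the fragment step, the sketch below splits an ORF into fragments between consecutive start codons and extends short ones rightwards. It is one reading of the description above, not orffinder's implementation: the name start_fragments and the min_length default are hypothetical, and the tBLASTn scoring and trimming decision are not shown.

```python
def start_fragments(starts, orf_end, min_length=30):
    """Split an ORF into start-to-start fragments.

    starts:     in-frame start codon indices, left to right
    orf_end:    index one past the stop codon
    min_length: minimum fragment length in nucleotides (hypothetical
                default; use a multiple of 3 to stay in frame)

    Returns (fragment_start, fragment_end) pairs. Short fragments are
    extended rightwards, but never past the end of the ORF.
    """
    fragments = []
    for start, next_start in zip(starts, starts[1:]):
        end = max(next_start, start + min_length)
        fragments.append((start, min(end, orf_end)))
    return fragments


if __name__ == "__main__":
    # Starts at 0, 30 and 36: the 6 nt fragment from 30 is extended to 12 nt.
    print(start_fragments([0, 30, 36], 60, min_length=12))
```

Each fragment would then be translated and searched with tBLASTn; an ORF with a single start codon yields no fragments and needs no refinement.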

orffinder has two standard modes of operation:

  1. Finding the best ORF in the forward direction in a short sequence, like a transcript.

python -m orffinder <reference_genomes.fasta> <query_transcript.fasta> best

This searches for the longest ORF, the ORF starting closest to the sequence start, and the ORF with the highest tBLASTn score against the reference genome collection. Start codons are then refined by tBLASTn.

  2. Finding all good ORFs in either the forward or reverse direction in a long sequence, like a genome or chromosome.

python -m orffinder <reference_genomes.fasta> <query_genome.fasta> all

This searches for all ORFs, removing short ORFs which overlap a longer ORF and ORFs with a low tBLASTn score against the reference genome collection. Start codons are then refined by tBLASTn.
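The overlap filter used in the all mode can be pictured as a greedy pass from longest to shortest ORF. The sketch below is an assumption about how such a filter might work, not orffinder's code: remove_overlapped is a hypothetical name, any overlap at all causes the shorter ORF to be dropped, and strand is ignored.

```python
def remove_overlapped(orfs):
    """Drop ORFs that overlap a longer ORF.

    orfs: iterable of (start, end) half-open intervals
    Returns the kept intervals sorted by position.
    """
    kept = []
    # Visit the longest ORFs first so they take priority over shorter ones.
    for start, end in sorted(orfs, key=lambda o: o[1] - o[0], reverse=True):
        overlaps = any(start < k_end and end > k_start for k_start, k_end in kept)
        if not overlaps:
            kept.append((start, end))
    return sorted(kept)


if __name__ == "__main__":
    # (250, 400) overlaps the longer (0, 300) and is removed.
    print(remove_overlapped([(0, 300), (250, 400), (500, 560)]))
```

A real filter may tolerate small overlaps or treat the two strands separately; this version is deliberately strict to keep the idea clear.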

Python module usage

You can use orffinder in your Python scripts; however, the API is subject to change. The OrfFinder class carries out high-level operations. Fasta is used for data ingestion. DnaSequence and Orf are used for ORF finding and filtering. Blast handles tBLASTn searches.

Contributors

zephyris, ulido