Comments (7)

tseemann avatar tseemann commented on June 12, 2024

It's a good question, @crarlus. tblastn could just be a drop-in replacement, but there is a lot of business logic that assumes DNA coordinates, etc.

What database do you use that is protein only?

from abricate.

crarlus avatar crarlus commented on June 12, 2024

My collaborator has a curated list of proteins collected from various resources. Of course, I tried (hard) to map them back to the original gene sequences, e.g. by BLASTing against the UniProt and UniParc databases. However, I could recover only some of the sequences. So it might be an odd case, but it is also the data reality we live in.

tseemann avatar tseemann commented on June 12, 2024

I have started adding tblastn support, but it is extremely slow because of the way I am using the genome as the query... it's not committed yet.

felipelira avatar felipelira commented on June 12, 2024

Pipe it through Prokka and you can use blastp after gene prediction. I mean, a brand new version:
ABRICATE+ (including amino acid databases)

tseemann avatar tseemann commented on June 12, 2024

Prokka relies on Prodigal to detect genes/ORFs, and it often misses genes that are broken, or that appear frameshifted because of homopolymer errors in 454/Ion Torrent/PacBio/MinION assemblies.

thsyd avatar thsyd commented on June 12, 2024

Hi, thank you so much for sharing ABRicate, @tseemann.
I just want to second the suggestion to improve the protein database option.
I am currently using abricate with a protein database of Pfam families (ca. 13,000 protein sequences) to screen putative plasmids for replication protein sequences. It does take a very long time :)
I'm sure you are aware of e.g. DIAMOND (https://github.com/bbuchfink/diamond), which should be much faster than BLAST.

tseemann avatar tseemann commented on June 12, 2024

@thsyd DIAMOND only provides blastp (Prot:Prot) and blastx (Prot:DNA).
Unfortunately, the design of abricate needs the query to be the contigs, so I need tblastn (DNA:Prot).

I don't think ABRicate is the best tool for what you want to do. Just running BLAST, MMseqs2, or DIAMOND directly would make more sense.
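For readers wondering what "running DIAMOND directly" might look like, here is a minimal sketch, not part of abricate. It only assembles the two command lines (build a protein database, then search the contigs against it in translated blastx mode). The file names `proteins.faa`, `contigs.fa`, `hits.tsv` and the helper `diamond_commands` are hypothetical placeholders; the options used (`makedb --in/--db`, `blastx --db/--query/--out/--outfmt 6`) are standard DIAMOND options, but check them against your installed version.

```python
import shlex


def diamond_commands(proteins_faa, contigs_fa, db_prefix="proteins", out_tsv="hits.tsv"):
    """Return the two DIAMOND command lines as argument lists.

    Hypothetical helper for illustration only: it builds the commands
    but does not run them.
    """
    # Step 1: index the curated protein FASTA into a DIAMOND database.
    makedb = ["diamond", "makedb", "--in", proteins_faa, "--db", db_prefix]
    # Step 2: translated search, DNA contigs as query against the protein
    # database, written as BLAST-style tabular output (format 6).
    blastx = [
        "diamond", "blastx",
        "--db", db_prefix,
        "--query", contigs_fa,
        "--out", out_tsv,
        "--outfmt", "6",
    ]
    return [makedb, blastx]


if __name__ == "__main__":
    # Print the commands so they can be inspected or pasted into a shell.
    for cmd in diamond_commands("proteins.faa", "contigs.fa"):
        print(shlex.join(cmd))
```

To actually execute the commands, each list could be passed to `subprocess.run(cmd, check=True)`, assuming `diamond` is installed and on the `PATH`.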
