
Comments (8)

gamcil commented on August 25, 2024

Sorry, I've never used PROKKA before. How are the files different? Might this be related to #4?

from clinker.

marade commented on August 25, 2024

Quite possibly it is related to #4. PROKKA is probably the most widely used quick-annotation program right now, so I reckon many people will want this. It appears PROKKA uses BioPerl to generate the files, e.g.

https://metacpan.org/pod/Bio::DB::GenBank

It's really quite easy to run PROKKA and generate them yourself. For your convenience I've attached a GenBank file generated by PROKKA, which I ran on Pseudomonas aeruginosa PAO1, though this is not ideal since it's only one contig and it's named '1'.
PAO1.zip

Note as well the comments about the GenBank format on the PROKKA home page. Thanks much!


gamcil commented on August 25, 2024

Could you give some more info about the error you were running into? The file you uploaded seems to load fine on my end.


marade commented on August 25, 2024

The problem appeared to arise from the contig names generated by a SPAdes genome assembly and then annotated by PROKKA, where clinker would choke on the first (LOCUS) line of the GenBank file, e.g.

LOCUS NODE_1_length_395402_cov_27.667845395402 bp DNA linear


gamcil commented on August 25, 2024

Okay this is definitely BioPython's GenBank parser not being able to parse long locus names, as you said. Unfortunately, there doesn't seem to be a way to get around it since they explicitly count columns when parsing the LOCUS line (i.e. maximum 16 characters for that field unless stealing from the length field, discussed here: biopython/biopython#747).

Unless I can get around to completely switching from the BioPython parser to something else, I don't think there's much I can do about this I'm afraid. In the meantime, could you try the --centre flag in PROKKA to rename your contigs to be NCBI compliant (as mentioned in the PROKKA readme), then run clinker again?
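
To check which records in a file would hit this limit before running clinker, something like the following rough sketch might help. It is not part of clinker, PROKKA or BioPython; the function name `long_locus_names` and the second example line are made up for illustration, and the 16-character limit is simply the figure quoted above. It splits each LOCUS line on whitespace and flags names that are too long, which also catches names that have run into the length field.

```python
MAX_LOCUS_NAME = 16  # field width quoted in the comment above (assumption)


def long_locus_names(lines, limit=MAX_LOCUS_NAME):
    """Return (line_number, name) pairs for LOCUS names longer than `limit`."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # The second whitespace-separated token on a LOCUS line is the name.
        if len(fields) >= 2 and fields[0] == "LOCUS" and len(fields[1]) > limit:
            flagged.append((number, fields[1]))
    return flagged


example = [
    # LOCUS line reported earlier in this thread
    "LOCUS NODE_1_length_395402_cov_27.667845395402 bp DNA linear",
    # Hypothetical short name, for contrast
    "LOCUS contig_1 1000 bp DNA linear",
]
print(long_locus_names(example))
```

Any iterable of lines works, so an open file handle can be passed in place of the example list. Records that get flagged are the ones worth renaming before parsing.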


marade commented on August 25, 2024

I'll try this when I get a chance, though if #10 gets solved this will no longer matter to me, since I try to avoid GenBank format whenever possible.


marade commented on August 25, 2024

The good news is using the --compliant switch for PROKKA apparently allows the script to continue beyond where it would previously crash, but see #21 mentioned above.


gamcil commented on August 25, 2024

Will close this one too since the PROKKA flag works and GFF support has been added with v0.0.10.

