Comments (7)
Hey @lexikazen, the bad news is that this looks like a bug in the coverage code. It would help me fix it if you would be willing to send over some data so that I can reproduce the issue on my computer. But I'll take a look at the code regardless and see if I can figure out what is wrong.
The good news is that `anvi-compute-metabolic-enrichment` doesn't need the coverage data (and even if it's there, it won't do anything with it anyway), so you can simply re-run `anvi-estimate-metabolism` without the `--add-coverage` flag, and you would get the output you need for the enrichment test.
from anvio.
I am not sure why this is happening from the output alone, but the only way a gene caller id, `g` (recognized in the contigs-db), can be missing from `self.profile_db.gene_level_coverage_stats_dict` is if the coverages were recovered only for genes known to the profile-db (i.e., because there was a cutoff that discarded some contigs from profiling, or there was a bin that only focused on a subset of genes in the contigs-db) and something upstream doesn't know about it. Constraining the gene calls to the universe of genes known to the profile-db is handled internally when the profile-db is initialized with a collection/bin, but maybe there is a bug somewhere when it is used through `anvi-estimate-metabolism`. Just mentioning these as self notes :)
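To make the suspected mismatch concrete, here is a minimal sketch using plain dictionaries as stand-ins. The names `contigs_db_gene_calls`, `gene_level_coverage_stats`, and `find_genes_without_coverage` are hypothetical and only illustrate the idea; they are not the actual anvi'o data structures or API.

```python
# Hypothetical stand-ins for illustration only; NOT the real anvi'o objects.
# Every gene caller id known to the contigs-db, mapped to its contig.
contigs_db_gene_calls = {0: "contig_A", 1: "contig_A", 2: "contig_B"}

# Coverage stats recovered only for the genes the profile-db knows about
# (here, contig_B was never profiled, so gene 2 has no entry).
gene_level_coverage_stats = {
    0: {"mean_coverage": 12.5},
    1: {"mean_coverage": 8.0},
}


def find_genes_without_coverage(gene_calls, coverage_stats):
    """Return the sorted gene caller ids that have no coverage entry."""
    return sorted(set(gene_calls) - set(coverage_stats))


missing = find_genes_without_coverage(contigs_db_gene_calls, gene_level_coverage_stats)
print(missing)  # any id listed here would raise KeyError on a direct lookup
```

A check like this before the coverage lookup would at least turn a bare `KeyError` into a message that says which genes are unknown to the profile-db.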
We can probably reproduce this error by creating a collection for any metagenome whose bins do not include all contigs in the contigs-db.
PS: over 1 million genes -- that's a nice dataset, @lexikazen :)
from anvio.
@ivagljiva I tried to attach my profile db and contigs db, but GitHub won't let me upload them. How should I share those files with you?
from anvio.
Actually, it is okay @lexikazen :) I managed to find a dataset that reproduces the error. It works for me when specifying a collection without metagenome mode, but breaks when combining a collection with metagenome mode. I guess that is a test I forgot to do when developing :p I had assumed that people with collections would want to treat each bin in the collection as an individual genome rather than using the collection to split out a subset of contigs to estimate on individually, but clearly there is a need for the latter. So thank you very much for finding this edge case! I will use my test data to fix the bug :)
from anvio.
Great thank you! :)
from anvio.
I managed to fix the bug :) Turns out we were loading gene calls from all splits in the DBs even when a collection name was passed, and this conflicted downstream when the gene coverages were loaded just for the collection.
PR #2242 addresses the issue in `anvio-dev`. @lexikazen, if you want, you could install the development branch and try your command again in that environment, and it should work :)
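For readers following along, the idea behind the fix (only load gene calls from the splits that belong to the collection, so they match the coverages loaded for that collection) can be sketched roughly as below. This is an illustration under assumed data shapes, not the code from the PR; `restrict_gene_calls_to_collection`, `gene_to_split`, and the split names are all hypothetical.

```python
# Illustrative sketch only; NOT the code from the PR. All names are hypothetical.

def restrict_gene_calls_to_collection(gene_to_split, collection_splits):
    """Keep only the gene calls whose split belongs to the collection."""
    wanted = set(collection_splits)
    return {gene: split for gene, split in gene_to_split.items() if split in wanted}


# Gene caller id -> split name, for every gene in the contigs-db.
gene_to_split = {
    0: "contig_A_split_00001",
    1: "contig_A_split_00001",
    2: "contig_B_split_00001",
}

# The collection only covers contig_A, so coverage exists only for genes 0 and 1.
collection_splits = ["contig_A_split_00001"]
coverage = {0: 12.5, 1: 8.0}

kept = restrict_gene_calls_to_collection(gene_to_split, collection_splits)

# Every remaining gene now has a coverage entry, so this lookup cannot fail.
gene_coverages = {gene: coverage[gene] for gene in kept}
print(gene_coverages)
```

Filtering the gene calls first, rather than catching the missing keys later, keeps both sides working on the same set of genes.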
from anvio.
Thank you very much, Iva! :)
from anvio.