biocore / burrito-fillings
Application controllers for command line bioinformatics applications
License: BSD 3-Clause "New" or "Revised" License
ported from biocore/qiime#1340
When running pick_otus.py with uclust, users often report the following error message on the forum:
Error running uclust. Possible causes are unsupported version (current supported version is v1.2.22) is installed or improperly formatted input file was provided
This error can occur even when uclust is correctly installed and a properly formatted input file is provided (e.g., this has occurred with the Illumina tutorial when running [qiime's] pick_open_reference_otus.py). The uclust application controller needs to be updated to report the original exit code and error message / output of running uclust to aid in debugging issues.
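A minimal sketch of the kind of reporting requested above, assuming a plain `subprocess` call rather than the actual application controller machinery. The names `ApplicationError` and `run_and_report` are hypothetical and only illustrate the idea of surfacing the original exit code and output instead of a generic message:

```python
import subprocess


class ApplicationError(Exception):
    """Hypothetical error type raised when a wrapped command fails."""


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return its stdout.

    On a non-zero exit status, raise an error that carries the original
    exit code plus whatever the tool wrote to stdout and stderr.
    """
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    # communicate() reads both pipes to the end and closes them.
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise ApplicationError(
            "Command %r failed with exit code %d.\nstdout:\n%s\nstderr:\n%s"
            % (cmd, proc.returncode, stdout, stderr))
    return stdout
```

With something along these lines, a forum report would include the tool's own message, which may be enough to tell a version mismatch from a malformed input file.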
I need to call uclust a large number of times in my script, and I get 'OSError: Too many open files' once the dataset exceeds a certain size when using the brokit.get_clusters_from_fasta_filepath function. However, when I use qiime_system_call to call uclust instead, I don't get an error. I think get_clusters_from_fasta_filepath may not be removing some output files, and I am not sure whether this is related to scikit-bio or brokit. I tried setting suppress_stdout of the Uclust class to true, but that did not solve the problem.
Thanks!
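For illustration only, and not a diagnosis of brokit itself: a sketch of a loop that launches an external command many times while making sure every file descriptor it opens is closed again within the same iteration. The function `run_many` is hypothetical, and a small Python child process stands in for the uclust call:

```python
import os
import subprocess
import sys
import tempfile


def run_many(n_calls, tmp_dir=None):
    """Run a stand-in command n_calls times without accumulating open files."""
    results = []
    for i in range(n_calls):
        # mkstemp returns an OS-level descriptor that must be closed explicitly;
        # leaking it on every call eventually hits the open-file limit.
        fd, path = tempfile.mkstemp(suffix=".uc", dir=tmp_dir)
        os.close(fd)
        try:
            # Stand-in for a uclust call: a child process that writes to `path`.
            subprocess.check_call(
                [sys.executable, "-c",
                 "import sys; f = open(sys.argv[1], 'w'); "
                 "f.write(sys.argv[2]); f.close()",
                 path, str(i)])
            with open(path) as handle:  # closed when the block exits
                results.append(handle.read())
        finally:
            # Remove the output file even if the command failed.
            os.remove(path)
    return results
```

If a loop like this runs cleanly at the same scale where the controller fails, that would point toward handles or temporary files being kept alive somewhere in the controller path.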
@teravest came up with a great brokit acronym, which I think should be added to README.md:
brokit: Biological Research (tool) Obsolescence Kit
Instead of using from tempfile import gettempdir.
related to biocore/qiime#1319
Because of the PyCogent imports in brokit, we need to convert its license from BSD to GPL.
Currently, the application controllers used in QIIME are:
Feel free to add any application controller that I missed
The new version of swarm is out and it's pretty different from 1.2.7. This algorithm will be relevant to qiime 2 and EMP.
I'm interested in taking this one, although I'll probably need support.
Ultimately we'll do this for everything, but focusing on what is most needed for QIIME right now. Application controllers corresponding to the boxes that are checked off here have been updated (if updates were necessary) and the unit tests have passed locally on @gregcaporaso's OS X machine.
Related to biocore/burrito#8. Many of the base classes depend on biocore/burrito#9 to not use /tmp.
See QIIME's #1501.
Needs to be from skbio.parse.sequence
It would be useful to port over the Vienna package controllers since it is so widely used.
It'd be nice to have bfillings.__version__ so that QIIME can display this info in print_qiime_config.py (see biocore/qiime#1718).
We're going to need to do a release of brokit before QIIME 1.9.0 so QIIME has a release version to depend on.
I think we should change the name at that time to be something more descriptive. One idea for a new name: burrito-fillings (since it contains burrito derived classes).
Thoughts on this? Ultimately this repo could be the place where we store our application controller derived classes, though we still do need to develop our testing strategy for these.
related to #19 (see comment from @josenavas)
.deb) fails the brokit unit tests; it turns out that I do have the PyCogent-wrapped version installed on compy though.
The function get_clusters_from_fasta_filepath does not accept a tmp_dir parameter to specify the temporary directory, and it defaults to /tmp. Related to biocore/qiime#1515
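One way the temporary-directory request could look, sketched with the standard library only. The helper `make_work_dir` and its `prefix` argument are hypothetical, and falling back to `tempfile.gettempdir()` is an assumption made for this example, not the project's chosen behaviour:

```python
import os
import tempfile


def make_work_dir(tmp_dir=None, prefix="uclust_"):
    """Create and return a scratch directory under tmp_dir.

    When tmp_dir is None, fall back to tempfile.gettempdir(), which
    honours the TMPDIR environment variable instead of assuming /tmp.
    """
    if tmp_dir is None:
        tmp_dir = tempfile.gettempdir()
    return tempfile.mkdtemp(prefix=prefix, dir=tmp_dir)
```

A caller on a cluster with a small /tmp could then pass a scratch path explicitly, for example `make_work_dir(tmp_dir="/scratch/me")`, where that path is just a placeholder.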
methods are not documented to standard
See QIIME's #1502. If we could remove the requirement that moltype be passed into these functions, we'll be able to remove QIIME's dependency on the cogent DNA moltype. Even if we just had wrappers for these that we could call from QIIME that passed the DNA moltype, that would be useful (though ugly on the brokit side).
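The wrapper idea can be sketched with `functools.partial`. Everything here is a stand-in: `align_with_moltype` is a hypothetical function, not one of the real controllers, and the string `"DNA"` replaces the cogent DNA moltype object so the example stays self-contained:

```python
from functools import partial

# Stand-in for the cogent DNA moltype object (assumption for this sketch).
DNA = "DNA"


def align_with_moltype(seqs, moltype, params=None):
    """Hypothetical controller function that requires a moltype argument."""
    return {"n_seqs": len(seqs), "moltype": moltype, "params": params or {}}


# Thin wrapper with moltype pre-bound, so a caller such as QIIME would not
# need to import or pass a moltype itself.
align_dna = partial(align_with_moltype, moltype=DNA)
```

The caller then uses `align_dna(seqs)` and the moltype import stays on the wrapper side.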
usearch also has the ability to perform local alignment by replacing --usearch_global with --usearch_local. I think this should be allowed as it can be used for OTU picking.
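A small sketch of how the choice between the two modes might be exposed. The helper `usearch_search_args` and its `local` and `extra_args` parameters are hypothetical; only the two flag names come from the issue text, and any further options are passed through unchanged rather than guessed at:

```python
def usearch_search_args(query_fp, local=False, extra_args=()):
    """Return the argument list for a global or local usearch search.

    Only the mode flag changes between the two cases; all other
    arguments are supplied by the caller via extra_args.
    """
    mode_flag = "--usearch_local" if local else "--usearch_global"
    return [mode_flag, query_fp] + list(extra_args)
```

Keeping the switch to a single boolean would leave the default behaviour (global alignment) unchanged for existing callers.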
https://github.com/biocore/brokit/blob/master/brokit/usearch.py#L1629-L1638
The main issue is that usearch_qf() in usearch.py does not use the output of step "Clustering sequences for error correction" if all chimera detection (de novo/reference) is suppressed.