qiita-spots / qiita

Qiita - A multi-omics databasing effort

Home Page: http://qiita.microbio.me

License: BSD 3-Clause "New" or "Revised" License

Python 47.07% HTML 17.22% JavaScript 1.66% Shell 0.05% CSS 0.04% Makefile 0.12% PLpgSQL 1.52% Jupyter Notebook 32.31%
microbiome microbiome-analysis python

qiita's Introduction

Qiita (canonically pronounced cheetah)


Advances in sequencing, proteomics, transcriptomics, metabolomics, and other technologies are giving us new insights into the microbial world and dramatically improving our ability to understand microbial community composition and function at high resolution. These technologies generate vast amounts of data, even from a single study or sample, leading to challenges in the storage, representation, analysis, and integration of the disparate data types. Qiita was designed to let users address these challenges by keeping track of multiple studies with multiple 'omics data. Additionally, Qiita supports multiple analytical pipelines through a 3rd-party plugin system, giving users a single entry point for all their analyses. Qiita's main site provides database and compute resources to the global community, alleviating the technical burdens, such as familiarity with the command line or access to compute power, that typically limit researchers studying microbial ecology.

Qiita is currently in production/stable status. We are very open to community contributions and feedback. If you're interested in contributing to Qiita, see CONTRIBUTING.md. If you'd like to report bugs or request features, you can do that in the Qiita issue tracker.

To install and configure your own Qiita server, see INSTALL.md. Note, however, that Qiita is designed to run on a server rather than locally, so we advise against installing your own copy on a personal computer. That said, it runs just fine on a laptop or small machine for development and educational purposes; for every PR and release we install Qiita from scratch in GitHub Actions, and you can follow those same steps.

For more specific details about Qiita's philosophy and design visit the Qiita main site tutorial.

Current features

  • Full study management: create, delete, and update samples in the sample information file and in multiple preparation information files.
  • Upload files via direct drag & drop from the web interface or via scp from any server that allows these connections.
  • Study privacy management: Sandboxed -> Private -> Public.
  • Easy long-term sequence data deposition to the European Nucleotide Archive (ENA), part of the European Bioinformatics Institute (EBI), for both private and public studies.
  • Raw data processing for Target Gene, Metagenomic, Metabolomic, Genome Isolate, and BIOM files. NOTE: BIOM files can be added as new preparation files for downstream analyses; however, these cannot be made public in the system.
  • Basic downstream analyses using QIIME 2. Note that Qiita produces qza/qzv files in the analytical steps, but non-QIIME 2 artifacts can also be converted.
  • Bulk download of studies and artifacts.
  • Basic study search in the study listing page.
  • Complex metadata search via redbiom.

For more detailed information visit the Qiita tutorial and the Qiita help.

Accepted raw files

  • Multiplexed SFF
  • Multiplexed FASTQ: forward, reverse (optional), and barcodes
  • Per sample FASTQ: forward and reverse (optional)
  • Multiplexed FASTA/qual files
  • Per sample FASTA, only for "Full Length Operon"

qiita's People

Contributors

adamrp, adswafford, amandabirmingham, antgonza, catfish47, charles-cowart, colinbrislawn, eldeveloper, gregcaporaso, hannesholste, jenwei, jorge-c, josenavas, justinshaffer, jwdebelius, mdehollander, mestaki, mivamo1214, mortonjt, qiyunzhu, rnaer, sarayupai, sjanssen2, squirrelo, stephanieorch, tanaes, teravest, wasade, wdwvt1, yotohoshi


qiita's Issues

error while loading the website

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 1141, in _when_complete
callback()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 1162, in _execute_method
self._when_complete(method(*self.path_args, **self.path_kwargs),
File "/Users/antoniog/svn_programs/qiita/qiita_pet/handlers/base_handlers.py", line 46, in get
self.render("404.html", user=self.get_current_user())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 538, in render
html = self.render_string(template_name, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 642, in render_string
t = loader.load(template_name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 338, in load
self.templates[name] = self._create_template(name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 366, in _create_template
template = Template(f.read(), name=name, loader=self)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 232, in __init__
self.code = self._generate_python(loader, compress_whitespace)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 280, in _generate_python
ancestors = self._get_ancestors(loader)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 298, in _get_ancestors
template = loader.load(chunk.name, self.name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 338, in load
self.templates[name] = self._create_template(name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 365, in _create_template
f = open(path, "rb")
IOError: [Errno 2] No such file or directory: '/Users/antoniog/svn_programs/qiita/qiita_pet/templates/base.html'

qiita_db/backends/sql/connections.py should be changed

Currently this module will always execute this line:

postgres = connect(user='defaultuser', database='qiime_md', host='localhost')

This should be changed to only execute when requested, i.e., something of the form:

class ConnectionManager(object):
    def getConnection(self):
        # here we should retrieve the credentials from a configuration
        # file or something, instead of hard-coding them
        return connect(user='defaultuser', database='qiime_md', host='localhost')

script to check for dependencies and to start all services

Ported from Qiita-pet.

We need to create 1 or 2 scripts to check for all dependencies and to start and perhaps stop all services.
 Daniel McDonald

 wasade commented 24 days ago
Ideally there is a single way to start qiita-pet, such as a foo.run()
method. The constructor for foo could do all the necessary service checks
and bail early. I understand there likely is a motive to support a similar
mechanism to print_qiime_config.py, but encapsulating the functionality
with the code that actually starts qiita-pet should simplify the tests and
the design. Then, something like print_qiitapet_config.py could effectively
just construct foo with, say, "print_service_state=True" or whatnot
…
 Joshua Shorenstein

squirrelo commented 24 days ago
I have an idea on how to do this. I'll include it in the next update I do once the iPython changeover is complete and we know all the dependencies needed.
 josenavas

josenavas commented 24 days ago
Agree with a script that checks all the dependencies, but disagree with starting all the services. If by services you mean redis, celery, IPython, etc., this is deployment-dependent. In the README we simply show an example of running a demo, but the actual deployment should be different. For example, the IPython cluster can be started as a single machine or distributed machines, with and without a PBS backend. Similarly with Celery.

In my opinion, only the webserver should be started by QiiTa-pet.
 Antonio Gonzalez

 antgonza commented 24 days ago
What about a configuration file with the default settings for a local machine
and another for an Ubuntu EC2 instance, something similar to the qiime
deploy configuration file? I'm just thinking that a "regular" user will
struggle a lot starting all those pieces, and I want to make it easier. Let me
know if you have other suggestions on how to do this.
 josenavas

josenavas commented 24 days ago
Sounds good
 Joshua Shorenstein

squirrelo commented 24 days ago
A config file listing all the services, all off to begin with except the bare minimum, seems pretty easy.
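The two ideas in this thread, a single `foo.run()` entry point and a config file listing services (all off except the bare minimum), could be combined as in the following sketch. Every name here (`ServiceConfig`, `check_dependencies`, `run`, the service list) is invented for illustration and is not actual Qiita code:

```python
# Hypothetical sketch: a service config plus a single run() entry point
# that checks dependencies and bails early. All names are invented.
import shutil


class ServiceConfig(object):
    """Which services this deployment should start (all off by default,
    except the bare minimum: the webserver itself)."""
    def __init__(self, redis=False, ipython_cluster=False, webserver=True):
        self.redis = redis
        self.ipython_cluster = ipython_cluster
        self.webserver = webserver


def check_dependencies(required=('sh',)):
    """Return the list of required executables missing from PATH."""
    return [exe for exe in required if shutil.which(exe) is None]


def run(config):
    """Start only the services enabled in the config, after checking deps."""
    missing = check_dependencies()
    if missing:
        raise RuntimeError('Missing dependencies: %s' % ', '.join(missing))
    started = [name for name in ('redis', 'ipython_cluster', 'webserver')
               if getattr(config, name)]
    # A real implementation would launch each enabled service here.
    return started
```

With the defaults, `run(ServiceConfig())` starts only the webserver, matching josenavas's point that everything else is deployment-dependent.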

Different level users

Ported from Qiita-pet

squirrelo commented 23 days ago
admin - access to everything/maintenance stuff
lab member - same as user but can see non-public studies
user - can see only public studies and only run analyses

Also ability to share studies/analyses between users using invitations.

 wasade commented 22 days ago
+1

 ElDeveloper commented 22 days ago
:+1:
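The three levels proposed above reduce to a simple visibility rule; here is an illustrative sketch (the constants and `can_view_study` are invented names, not Qiita's actual access model):

```python
# Illustrative sketch of the proposed user levels. All names are invented.
ADMIN, LAB_MEMBER, USER = 'admin', 'lab member', 'user'


def can_view_study(user_level, study_is_public, is_shared=False):
    """Apply the proposed rules: admins and lab members can see non-public
    studies; plain users see only public studies, or studies shared with
    them via an invitation."""
    if user_level in (ADMIN, LAB_MEMBER):
        return True
    return study_is_public or is_shared
```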

Implement qiita-db person object

Linked to information in study_person. May also be able to make this a base class that the user object builds from since the same base information is in both.
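One way to read "a base class that the user object builds from" is a sketch like the following; the class and field names are hypothetical, not the eventual qiita-db implementation:

```python
# Hypothetical sketch: shared base information lives in a Person base
# class, with User building on it. Not the actual qiita-db design.
class Person(object):
    """Base information shared by study_person rows and users."""
    def __init__(self, name, email, affiliation=None):
        self.name = name
        self.email = email
        self.affiliation = affiliation


class User(Person):
    """A user is a person with login-specific fields on top."""
    def __init__(self, name, email, password_hash, affiliation=None):
        super(User, self).__init__(name, email, affiliation)
        self.password_hash = password_hash
```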

Update setup.py

This file makes no sense right now, assumes an outdated structure of the project and points to old links.

Open pgcursor in own try-except

Ported from Qiita-pet.

squirrelo commented 23 days ago
If the try-except fails on pgcursor creation, then pgcursor.close() will also fail. This needs to be broken out into its own try-except.

wasade commented 22 days ago
Agree. This behavior should be centralized through a common general mechanism, otherwise the code will be full of difficult-to-read replication with try/excepts. This has the further benefit of tightening and isolating where database interaction is actually performed.
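A common way to centralize this is a context manager, so close() is only attempted when cursor creation succeeded. A sketch, using sqlite3 as a stand-in for any DB-API driver such as psycopg2 (not Qiita's actual code):

```python
# Sketch of centralizing cursor handling so cursor.close() is only
# attempted when creation succeeded. sqlite3 stands in for any DB-API
# driver such as psycopg2.
import sqlite3
from contextlib import contextmanager


@contextmanager
def get_cursor(conn):
    cursor = conn.cursor()  # if this raises, no close() is attempted
    try:
        yield cursor
    finally:
        cursor.close()      # runs only when creation succeeded


conn = sqlite3.connect(':memory:')
with get_cursor(conn) as cur:
    cur.execute('SELECT 1')
    result = cur.fetchone()[0]
```

All database interaction then funnels through `get_cursor`, which is exactly the isolation the comment above asks for.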

Freezer status metadata

(from Rob) It would be REALLY useful to be able to detect for which samples we have DNA and/or a physical specimen left over in the freezer.

summary/histogram before submission

(from Se Jin and LukeU) Currently, there is no easy way to decide on a rarefaction level, so add a summary/histogram of sequences per sample before selecting parameters for analyses.
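A minimal sketch of such a summary, assuming per-sample sequence counts are already available (the function name and input format are invented for illustration):

```python
# Sketch: summarize sequences per sample so users can pick a rarefaction
# depth before launching an analysis. Input format is assumed.
def summarize_seqs_per_sample(counts):
    """Return (min, median, max) of per-sample sequence counts."""
    ordered = sorted(counts.values())
    return ordered[0], ordered[len(ordered) // 2], ordered[-1]


counts = {'s1': 1200, 's2': 8000, 's3': 400, 's4': 9500, 's5': 7600}
low, med, high = summarize_seqs_per_sample(counts)
# Rarefying at `low` keeps every sample; any higher depth drops samples.
```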

Ability to download beta-diversity text file

Ported from Qiita-pet.

squirrelo commented a month ago
The current QIIME database does not provide the beta-diversity distance matrix. Could this be made available in the new release?
 Yoshiki Vázquez Baeza

ElDeveloper commented a month ago
I'm pretty sure the distance matrix is part of the zipped file that comes as the result of a meta-analysis, if you decided to compute any beta-diversity metric to plot it in a three-dimensional space. But I agree that there should be a direct link to this file instead of just to the 3D plot.

pgsql type json doesn't exist

Ported from Qiita-pet.

antgonza commented 2 months ago
This data type was introduced in PostgreSQL 9.2, and the default in Ubuntu is 9.1; we should use text or something like that.
 Joshua Shorenstein

squirrelo commented 2 months ago
Welp. Easy enough fix to shove the json in a text data type, but it's kind of annoying not to have the verification abilities.
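Shoving the JSON into a text column and recovering some verification in the application layer could look like this sketch; sqlite3 stands in for PostgreSQL 9.1, and the table and column names are invented:

```python
# Sketch: keep JSON in a plain text column (as on PostgreSQL 9.1) and do
# the validation the `json` type would have provided in application code.
# sqlite3 stands in for PostgreSQL; table/column names are invented.
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE job (job_id INTEGER PRIMARY KEY, options TEXT)')


def insert_job(options_dict):
    # json.dumps guarantees the stored text is valid JSON
    conn.execute('INSERT INTO job (options) VALUES (?)',
                 (json.dumps(options_dict),))


def load_options(job_id):
    row = conn.execute('SELECT options FROM job WHERE job_id = ?',
                       (job_id,)).fetchone()
    return json.loads(row[0])  # raises ValueError if the text is not JSON


insert_job({'depth': 1000, 'metric': 'unifrac'})
```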
 Daniel McDonald

 wasade commented 2 months ago
Biof infrastructure is based on rhel and centos, why use ubuntu?
…
 Antonio Gonzalez

 antgonza commented 2 months ago
Currently, I'm testing in the fastest way possible to deploy everything =
EC2. Originally, I thought we would be working with Ubuntu at the Broad, but
a few minutes ago we realized that we have CentOS release 5.5 (Final) ...
so versions should be older. Fun times!!!!!
 Daniel McDonald

 wasade commented 2 months ago
Ask Jeff about puppet deploys please
…

delete meta analysis before it finishes

Ported from Qiita-pet.

antgonza commented a month ago
There is no way to stop/kill a meta-analysis before the job finishes.
 Joshua Shorenstein

squirrelo commented a month ago
This is a celery shortcoming. When we move over to the iPython/torque backend it should be possible.
 Antonio Gonzalez

 antgonza commented a month ago
OK, thanks, this is good to know.
 Daniel McDonald

 wasade commented a month ago
can't revoke?

http://docs.celeryproject.org/en/latest/userguide/workers.html#revoking-tasks
…
 Joshua Shorenstein

squirrelo commented a month ago
Huh, I guess you can revoke, but would need to track the process IDs, most likely in redis somehow. I'll look into it if this is something we want to continue on, otherwise just dump for iPython.
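Tracking the task IDs so they can be revoked later could be as simple as the following sketch. The dict stands in for redis, and `revoke` is passed in as a callable rather than hard-wired to Celery's `app.control.revoke`; all names are invented:

```python
# Sketch: remember the task IDs spawned for each meta-analysis so they can
# be revoked before completion. A dict stands in for redis, and the revoke
# callable stands in for Celery's app.control.revoke.
analysis_tasks = {}  # analysis_id -> list of task ids


def register_task(analysis_id, task_id):
    analysis_tasks.setdefault(analysis_id, []).append(task_id)


def cancel_analysis(analysis_id, revoke):
    """Revoke every task recorded for the analysis and forget them."""
    for task_id in analysis_tasks.pop(analysis_id, []):
        revoke(task_id, terminate=True)


register_task(7, 'task-a')
register_task(7, 'task-b')
revoked = []
cancel_analysis(7, lambda tid, terminate: revoked.append(tid))
```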

Job exists test needs to check data used

This is a bit tricky, so it will be taken care of after the demo.

The easiest way to check this is to select all (processed_data_id, sample_id) pairs from the analysis_sample table using an AND query for the passed information, sorted by analysis_id. If the set of tuples for a single analysis matches the ones passed to the exists function, query all tuple pairs for that analysis to make sure the match isn't just a subset of its data. If they still match after that, the job exists and can be linked.
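The comparison described above boils down to exact set equality between the requested pairs and an analysis's full set of pairs; a plain-Python sketch (the data layout is assumed from the issue text, not taken from Qiita's schema):

```python
# Sketch of the exists check: a job exists only if some analysis used
# exactly the same (processed_data_id, sample_id) pairs -- a subset match
# is not enough. Data layout is assumed from the issue description.
def job_exists(requested_pairs, analyses):
    """analyses maps analysis_id -> set of (processed_data_id, sample_id)."""
    requested = set(requested_pairs)
    for analysis_id, pairs in analyses.items():
        if pairs == requested:       # full match, not merely a subset
            return analysis_id
    return None


analyses = {1: {(10, 's1'), (10, 's2')},
            2: {(10, 's1'), (10, 's2'), (10, 's3')}}
```

Note that a request of `[(10, 's1')]` matches no analysis, even though it is a subset of both, which is exactly the pitfall the issue warns about.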

Do we want unique sample ids across all studies?

Currently the DB has a unique constraint on the sample id, which means that sample ids must be unique across all studies. Is this something we want?

Note: no need for an answer prior to the HMP2 demo, but creating the issue so we can keep track of it!

Create test DB for travis

Create the database structure and populate it before running the tests, so we already have some test data when the tests execute.

Database population issues

  • Default user_level_id is 5 (unconfirmed email address)
  • Implement bcrypt hashing for passwords
  • Setting for single user to default to the "qiita" user (user_level_id of 1, super_user)
  • PANDAS!
  • American Gut tables port
  • Add tables for study_status, the study person table (lab_person_id, principal_investigator_id, emp_person), and timeseries_type
  • Standardize what first_contact and most_recent_contact hold (date or person contacted?) with a comment in the database
  • Comment on the file type table that it stores the file types allowed (fasta, fastq, etc.)
  • Investigation needs a name
  • Insert default values for user_level, analysis_status, and all other vocab restriction tables
  • Remove the salt column from qiita_users

Remove SQL from webserver.py

Ported from Qiita-pet.

squirrelo commented 23 days ago
Make a webutil.py or some other file in the app folder, and add functions that take care of all the SQL that is currently in webserver.py.

squirrelo commented 22 days ago
As a heads up, @josenavas and I have most of this done in an effort to start the QiiTa-API idea we talked about.

wasade commented 22 days ago
If isolating the API, would it make sense to shift that to QiiTa?

 squirrelo commented 22 days ago
Yup. There are a few new things that came up, including something like an API built into QiiTa-DB for accessing the data. That way, if users don't need the full-scale database setup they won't need to install it.

 wasade commented 22 days ago
I have some API-ish code from December; will send it on in a separate email.
May be useful, may be garbage, as things have evolved.
