qiita-spots / qiita

Qiita - A multi-omics databasing effort

Home Page: http://qiita.microbio.me

License: BSD 3-Clause "New" or "Revised" License

Python 47.07% HTML 17.22% JavaScript 1.66% Shell 0.05% CSS 0.04% Makefile 0.12% PLpgSQL 1.52% Jupyter Notebook 32.31%
microbiome microbiome-analysis python

qiita's Introduction

Qiita (canonically pronounced cheetah)


Advances in sequencing, proteomics, transcriptomics, metabolomics, and other technologies are giving us new insights into the microbial world and dramatically improving our ability to understand microbial community composition and function at high resolution. These technologies generate vast amounts of data, even from a single study or sample, leading to challenges in the storage, representation, analysis, and integration of the disparate data types. Qiita was designed to let users address these challenges by keeping track of multiple studies with multiple 'omics data. Additionally, Qiita supports multiple analytical pipelines through a 3rd-party plugin system, giving users a single entry point for all their analyses. Qiita's main site provides database and compute resources to the global community, alleviating the technical burdens, such as familiarity with the command line or access to compute power, that typically limit researchers studying microbial ecology.

Qiita is currently in production/stable status. We are very open to community contributions and feedback. If you're interested in contributing to Qiita, see CONTRIBUTING.md. If you'd like to report bugs or request features, you can do that in the Qiita issue tracker.

To install and configure your own Qiita server, see INSTALL.md. Note, however, that Qiita is designed to run on a server rather than locally, so we advise against installing your own copy on a personal computer. That said, it runs just fine on a laptop or small machine for development and educational purposes; for every PR and release we install Qiita from scratch in GitHub Actions, and you can follow those same steps.

For more specific details about Qiita's philosophy and design visit the Qiita main site tutorial.

Current features

  • Full study management: create, delete, and update samples in the sample information file and in multiple preparation information files.
  • Upload files via direct drag & drop from the web interface or via scp from any server that allows these connections.
  • Study privacy management: Sandboxed -> Private -> Public.
  • Easy long-term sequence data deposition to the European Nucleotide Archive (ENA), part of the European Bioinformatics Institute (EBI), for both private and public studies.
  • Raw data processing for Target Gene, Metagenomic, Metabolomic, Genome Isolate, and BIOM files. NOTE: BIOM files can be added as new preparation files for downstream analyses; however, these cannot be made public in the system.
  • Basic downstream analyses using QIIME 2. Note that Qiita produces qza/qzv files in the analytical steps, but non-QIIME 2 artifacts can also be converted.
  • Bulk download of studies and artifacts.
  • Basic study search in the study listing page.
  • Complex metadata search via redbiom.

For more detailed information visit the Qiita tutorial and the Qiita help.

Accepted raw files

  • Multiplexed SFF
  • Multiplexed FASTQ: forward, reverse (optional), and barcodes
  • Per sample FASTQ: forward and reverse (optional)
  • Multiplexed FASTA/qual files
  • Per sample FASTA, only for "Full Length Operon"

qiita's People

Contributors

adamrp, adswafford, amandabirmingham, antgonza, catfish47, charles-cowart, colinbrislawn, eldeveloper, gregcaporaso, hannesholste, jenwei, jorge-c, josenavas, justinshaffer, jwdebelius, mdehollander, mestaki, mivamo1214, mortonjt, qiyunzhu, rnaer, sarayupai, sjanssen2, squirrelo, stephanieorch, tanaes, teravest, wasade, wdwvt1, yotohoshi


qiita's Issues

error while loading the website

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 1141, in _when_complete
callback()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 1162, in _execute_method
self._when_complete(method(*self.path_args, **self.path_kwargs),
File "/Users/antoniog/svn_programs/qiita/qiita_pet/handlers/base_handlers.py", line 46, in get
self.render("404.html", user=self.get_current_user())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 538, in render
html = self.render_string(template_name, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py", line 642, in render_string
t = loader.load(template_name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 338, in load
self.templates[name] = self._create_template(name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 366, in _create_template
template = Template(f.read(), name=name, loader=self)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 232, in __init__
self.code = self._generate_python(loader, compress_whitespace)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 280, in _generate_python
ancestors = self._get_ancestors(loader)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 298, in _get_ancestors
template = loader.load(chunk.name, self.name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 338, in load
self.templates[name] = self._create_template(name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/template.py", line 365, in _create_template
f = open(path, "rb")
IOError: [Errno 2] No such file or directory: '/Users/antoniog/svn_programs/qiita/qiita_pet/templates/base.html'

qiita_db/backends/sql/connections.py should be changed

Currently this module will always execute this line:

postgres = connect(user='defaultuser', database='qiime_md', host='localhost')

This should be changed to only execute when requested, i.e., something of the form:

class ConnectionManager(object):
    def getConnection(self):
        # here we should retrieve the credentials from a configuration
        # file or something, instead of hard-coding them
        return connect(user='defaultuser', database='qiime_md', host='localhost')

script to check for dependencies and to start all services

Ported from Qiita-pet.

We need to create 1 or 2 scripts to check for all dependencies and to start and perhaps stop all services.
 Daniel McDonald

 wasade commented 24 days ago
Ideally there is a single way to start qiita-pet, such as a foo.run()
method. The constructor for foo could do all the necessary service checks
and bail early. I understand there likely is a motive to support a similar
mechanism to print_qiime_config.py, but encapsulating the functionality
with the code that actually starts qiita-pet should simplify the tests and
the design. Then, something like print_qiitapet_config.py could effectively
just construct foo with, say, "print_service_state=True" or whatnot
…
 Joshua Shorenstein

squirrelo commented 24 days ago
I have an idea on how to do this. I'll include it in the next update I do once the iPython changeover is complete and we know all the dependencies needed.
 josenavas

josenavas commented 24 days ago
Agree with a script that checks all the dependencies, but disagree with starting all the services. If by services you mean redis, celery, IPython, etc., this is deployment-dependent. In the README we simply show an example of running a demo, but the actual deployment should be different. For example, the IPython cluster can be started as a single machine or distributed machines, with and without a PBS backend. Similarly with Celery.

In my opinion, only the webserver should be started by QiiTa-pet.
 Antonio Gonzalez

 antgonza commented 24 days ago
What about a configuration file with the default settings for a local machine
and another for an Ubuntu EC2 instance, something similar to the qiime
deploy configuration file? I'm just thinking that a "regular" user will
struggle a lot starting all those pieces, and I want to make it easier. Let me
know if you have other suggestions on how to do this.
 josenavas

josenavas commented 24 days ago
Sounds good
 Joshua Shorenstein

squirrelo commented 24 days ago
A config file listing all the services, all off to begin with except the bare minimum, seems pretty easy.
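The two ideas in this thread, a single `foo.run()` entry point and a config file listing services (all off except the bare minimum), could be combined as in the following sketch. Every name here (`ServiceConfig`, `check_dependencies`, `run`, the service list) is invented for illustration and is not actual Qiita code:

```python
# Hypothetical sketch: a service config plus a single run() entry point
# that checks dependencies and bails early. All names are invented.
import shutil


class ServiceConfig(object):
    """Which services this deployment should start (all off by default,
    except the bare minimum: the webserver itself)."""
    def __init__(self, redis=False, ipython_cluster=False, webserver=True):
        self.redis = redis
        self.ipython_cluster = ipython_cluster
        self.webserver = webserver


def check_dependencies(required=('sh',)):
    """Return the list of required executables missing from PATH."""
    return [exe for exe in required if shutil.which(exe) is None]


def run(config):
    """Start only the services enabled in the config, after checking deps."""
    missing = check_dependencies()
    if missing:
        raise RuntimeError('Missing dependencies: %s' % ', '.join(missing))
    started = [name for name in ('redis', 'ipython_cluster', 'webserver')
               if getattr(config, name)]
    # A real implementation would launch each enabled service here.
    return started
```

With the defaults, `run(ServiceConfig())` starts only the webserver, matching josenavas's point that everything else is deployment-dependent.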

Different level users

Ported from Qiita-pet

squirrelo commented 23 days ago
admin - access to everything/maintenance stuff
lab member - same as user but can see non-public studies
user - can see only public studies and only run analyses

Also ability to share studies/analyses between users using invitations.

 wasade commented 22 days ago
+1

 ElDeveloper commented 22 days ago
:+1:
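The three levels proposed above reduce to a simple visibility rule; here is an illustrative sketch (the constants and `can_view_study` are invented names, not Qiita's actual access model):

```python
# Illustrative sketch of the proposed user levels. All names are invented.
ADMIN, LAB_MEMBER, USER = 'admin', 'lab member', 'user'


def can_view_study(user_level, study_is_public, is_shared=False):
    """Apply the proposed rules: admins and lab members can see non-public
    studies; plain users see only public studies, or studies shared with
    them via an invitation."""
    if user_level in (ADMIN, LAB_MEMBER):
        return True
    return study_is_public or is_shared
```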

Implement qiita-db person object

Linked to information in study_person. May also be able to make this a base class that the user object builds from since the same base information is in both.
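One way to read "a base class that the user object builds from" is a sketch like the following; the class and field names are hypothetical, not the eventual qiita-db implementation:

```python
# Hypothetical sketch: shared base information lives in a Person base
# class, with User building on it. Not the actual qiita-db design.
class Person(object):
    """Base information shared by study_person rows and users."""
    def __init__(self, name, email, affiliation=None):
        self.name = name
        self.email = email
        self.affiliation = affiliation


class User(Person):
    """A user is a person with login-specific fields on top."""
    def __init__(self, name, email, password_hash, affiliation=None):
        super(User, self).__init__(name, email, affiliation)
        self.password_hash = password_hash
```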

Update setup.py

This file makes no sense right now, assumes an outdated structure of the project and points to old links.

Open pgcursor in own try-except

Ported from Qiita-pet.

squirrelo commented 23 days ago
If the try-except fails on pgcursor creation, then pgcursor.close() will also fail. This needs to be broken out into its own try-except.

wasade commented 22 days ago
Agree. This behavior should be centralized through a common general mechanism, otherwise the code will be full of difficult-to-read replication with try/excepts. This has the further benefit of tightening and isolating where database interaction is actually performed.
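A common way to centralize this is a context manager, so close() is only attempted when cursor creation succeeded. A sketch, using sqlite3 as a stand-in for any DB-API driver such as psycopg2 (not Qiita's actual code):

```python
# Sketch of centralizing cursor handling so cursor.close() is only
# attempted when creation succeeded. sqlite3 stands in for any DB-API
# driver such as psycopg2.
import sqlite3
from contextlib import contextmanager


@contextmanager
def get_cursor(conn):
    cursor = conn.cursor()  # if this raises, no close() is attempted
    try:
        yield cursor
    finally:
        cursor.close()      # runs only when creation succeeded


conn = sqlite3.connect(':memory:')
with get_cursor(conn) as cur:
    cur.execute('SELECT 1')
    result = cur.fetchone()[0]
```

All database interaction then funnels through `get_cursor`, which is exactly the isolation the comment above asks for.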

Freezer status metadata

(from Rob) It would be REALLY useful to be able to detect for which samples we have DNA and/or a physical specimen left over in the freezer.

summary/histogram before submission

(from Se Jin and LukeU) Currently, there is no easy way to decide on a rarefaction level, so add a summary/histogram of sequences per sample before selecting parameters for analyses.
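A minimal sketch of such a summary, assuming per-sample sequence counts are already available (the function name and input format are invented for illustration):

```python
# Sketch: summarize sequences per sample so users can pick a rarefaction
# depth before launching an analysis. Input format is assumed.
def summarize_seqs_per_sample(counts):
    """Return (min, median, max) of per-sample sequence counts."""
    ordered = sorted(counts.values())
    return ordered[0], ordered[len(ordered) // 2], ordered[-1]


counts = {'s1': 1200, 's2': 8000, 's3': 400, 's4': 9500, 's5': 7600}
low, med, high = summarize_seqs_per_sample(counts)
# Rarefying at `low` keeps every sample; any higher depth drops samples.
```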

Ability to download beta-diversity text file

Ported from Qiita-pet.

squirrelo commented a month ago
The current QIIME database does not provide the beta-diversity distance matrix. Could this be made available in the new release?
 Yoshiki Vázquez Baeza

ElDeveloper commented a month ago
I'm pretty sure the distance matrix is part of the zipped file that comes as the result of a meta-analysis, if you decided to compute any beta-diversity metric to plot it in a three-dimensional space. But I agree that there should be a direct link to this file instead of just to the 3D plot.

pgsql type json doesn't exist

Ported from Qiita-pet.

antgonza commented 2 months ago
This data type was introduced in PostgreSQL 9.2, and the default in Ubuntu is 9.1; we should use text or something like that.
 Joshua Shorenstein

squirrelo commented 2 months ago
Welp. Easy enough fix to shove the json in a text data type, but it's kind of annoying not to have the verification abilities.
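Shoving the JSON into a text column and recovering some verification in the application layer could look like this sketch; sqlite3 stands in for PostgreSQL 9.1, and the table and column names are invented:

```python
# Sketch: keep JSON in a plain text column (as on PostgreSQL 9.1) and do
# the validation the `json` type would have provided in application code.
# sqlite3 stands in for PostgreSQL; table/column names are invented.
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE job (job_id INTEGER PRIMARY KEY, options TEXT)')


def insert_job(options_dict):
    # json.dumps guarantees the stored text is valid JSON
    conn.execute('INSERT INTO job (options) VALUES (?)',
                 (json.dumps(options_dict),))


def load_options(job_id):
    row = conn.execute('SELECT options FROM job WHERE job_id = ?',
                       (job_id,)).fetchone()
    return json.loads(row[0])  # raises ValueError if the text is not JSON


insert_job({'depth': 1000, 'metric': 'unifrac'})
```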
 Daniel McDonald

 wasade commented 2 months ago
Biof infrastructure is based on rhel and centos, why use ubuntu?
…
 Antonio Gonzalez

 antgonza commented 2 months ago
Currently, I'm testing in the fastest way possible to deploy everything =
EC2. Originally, I thought we would be working with Ubuntu at the Broad, but
a few minutes ago we realized that we have CentOS release 5.5 (Final) ...
so versions should be older. Fun times!!!!!
 Daniel McDonald

 wasade commented 2 months ago
Ask Jeff about puppet deploys please
…

delete meta analysis before it finishes

Ported from Qiita-pet.

antgonza commented a month ago
There is no way to stop/kill a meta-analysis before the job finishes.
 Joshua Shorenstein

squirrelo commented a month ago
This is a celery shortcoming. When we move over to the iPython/torque backend it should be possible.
 Antonio Gonzalez

 antgonza commented a month ago
OK, thanks, this is good to know.
 Daniel McDonald

 wasade commented a month ago
can't revoke?

http://docs.celeryproject.org/en/latest/userguide/workers.html#revoking-tasks
…
 Joshua Shorenstein

squirrelo commented a month ago
Huh, I guess you can revoke, but would need to track the process IDs, most likely in redis somehow. I'll look into it if this is something we want to continue on, otherwise just dump for iPython.
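Tracking the task IDs so they can be revoked later could be as simple as the following sketch. The dict stands in for redis, and `revoke` is passed in as a callable rather than hard-wired to Celery's `app.control.revoke`; all names are invented:

```python
# Sketch: remember the task IDs spawned for each meta-analysis so they can
# be revoked before completion. A dict stands in for redis, and the revoke
# callable stands in for Celery's app.control.revoke.
analysis_tasks = {}  # analysis_id -> list of task ids


def register_task(analysis_id, task_id):
    analysis_tasks.setdefault(analysis_id, []).append(task_id)


def cancel_analysis(analysis_id, revoke):
    """Revoke every task recorded for the analysis and forget them."""
    for task_id in analysis_tasks.pop(analysis_id, []):
        revoke(task_id, terminate=True)


register_task(7, 'task-a')
register_task(7, 'task-b')
revoked = []
cancel_analysis(7, lambda tid, terminate: revoked.append(tid))
```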

Job exists test needs to check data used

This is a bit tricky, so it will be taken care of after the demo.

The easiest way to check this is to select all (processed_data_id, sample_id) pairs from the analysis_sample table using an AND query for the passed information, sorted by analysis_id. If the set of tuples for a single analysis matches the ones passed to the exists function, query all tuple pairs for that analysis to make sure the match isn't just a subset of its data. If they still match after that, the job exists and can be linked.
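The comparison described above boils down to exact set equality between the requested pairs and an analysis's full set of pairs; a plain-Python sketch (the data layout is assumed from the issue text, not taken from Qiita's schema):

```python
# Sketch of the exists check: a job exists only if some analysis used
# exactly the same (processed_data_id, sample_id) pairs -- a subset match
# is not enough. Data layout is assumed from the issue description.
def job_exists(requested_pairs, analyses):
    """analyses maps analysis_id -> set of (processed_data_id, sample_id)."""
    requested = set(requested_pairs)
    for analysis_id, pairs in analyses.items():
        if pairs == requested:       # full match, not merely a subset
            return analysis_id
    return None


analyses = {1: {(10, 's1'), (10, 's2')},
            2: {(10, 's1'), (10, 's2'), (10, 's3')}}
```

Note that a request of `[(10, 's1')]` matches no analysis, even though it is a subset of both, which is exactly the pitfall the issue warns about.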

Do we want unique sample ids across all studies?

Currently the DB has a unique constraint on the sample id, which means that sample ids must be unique across all studies. Is this something we want?

Note: no need for an answer prior to the HMP2 demo, but creating the issue so we can keep track of it!

Create test DB for travis

Create the database structure and populate it before running the tests, so we already have some test data when the tests execute.

Database population issues

  • Default user_level_id is 5 (unconfirmed email address)
  • Implement bcrypt hashing for passwords
  • Setting for single user to default to the "qiita" user (user_level_id of 1, super_user)
  • PANDAS!
  • American Gut tables port
  • Add tables for study_status, the study person table (lab_person_id, principal_investigator_id, emp_person), and timeseries_type
  • Standardize what first_contact and most_recent_contact hold (date or person contacted?) with a comment in the database
  • Comment on the file type table that it stores the file types allowed (fasta, fastq, etc.)
  • Investigation needs a name
  • Insert default values for user_level, analysis_status, and all other vocab restriction tables
  • Remove the salt column from qiita_users

Remove SQL from webserver.py

Ported from Qiita-pet.

squirrelo commented 23 days ago
Make a webutil.py or some other file in the app folder, and add functions that take care of all the SQL that is currently in webserver.py.

squirrelo commented 22 days ago
As a heads up, @josenavas and I have most of this done in an effort to start the QiiTa-API idea we talked about.

wasade commented 22 days ago
If isolating the API, would it make sense to shift that to QiiTa?

 squirrelo commented 22 days ago
Yup. There are a few new things that came up, including something like an API built into QiiTa-DB for accessing the data. That way, if users don't need the full-scale database setup they won't need to install it.

 wasade commented 22 days ago
I have some API-ish code from December; will send it on in a separate email.
May be useful, may be garbage, as things have evolved.
