
wormlabcaltech / alaska

Automated and friendly RNA-seq analysis (deprecated)

Home Page: http://alaska.caltech.edu

License: MIT License

R 1.35% Shell 2.18% Python 34.75% Dockerfile 1.08% Perl 4.53% HTML 21.19% JavaScript 20.96% CSS 13.35% PHP 0.61%
rna-seq automation pipeline alaska analysis analysis-pipeline portal quality-control sleuth kallisto

alaska's People

Contributors

dangeles, lioscro, raymond91125

alaska's Issues

Beta test #1 issues

  • The user must see the directory structure modal (shown when the "See Examples" button is clicked) before uploading their reads. Make the modal show up when "See FTP details" is clicked. At the bottom, add a button that reads "I have prepared my reads" and shows the FTP details when clicked. Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
  • Fix FTP server not accepting passive connections. Fixed on 1/11/2019 by reusing old ftpd_server.
  • Testing buttons show even though "testing=true" is not present in the URL. Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
  • Fix issue where the "Start a new project!" button sometimes isn't disabled even though a project id was given through the URL. Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
  • Make user-expandable list inputs more intuitive to use. When the "Add" button is clicked, add new row for the specified input and reset the textbox. Implemented on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
  • For custom dropdowns (dropdowns that include an "other" choice), make it obvious to the user that they must fill in the textbox when "other" is selected. This may be achieved by either 1) hiding the textbox until "other" is selected or 2) putting a placeholder value in the textbox that reads "If "other" selected". Implemented on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
  • When inputting all factor values, values that have been removed with the "Remove" button still show up in the factor dropdown of individual samples. Fixed on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
  • Make sample description & factor choices easier to fill out at a larger scale. Implemented on 2/22/2019 commit 3b1f109fb054f57280b8ff4d2aadfd71df37de6d
  • Implement new isoform annotations into sleuth analysis. Currently, annotations are pulled directly from Ensembl through BioMart. However, BioMart only supports C. elegans genome version 235, which is very outdated compared to the more recent 266. Gene isoform annotations can be found here: ftp://ftp.wormbase.org/pub/wormbase/species/c_elegans/PRJNA13758/annotation/geneIDs/. Implemented on 1/16/2019 commit 4d04a0f4528a5c8402cf0065e93f84df02843509. Alaska no longer fetches annotation data from Ensembl BioMart, but instead uses an annotation flatfile.

Tissue metadata

Current options for tissue are:

Whole-worm (multi-worm)
Whole-worm (single-worm)
Other

Maybe we should expand other to:
single cell
single cell type/Tissue

For cell type/Tissue, we could provide the largest 10 tissues in C. elegans. Raymond can provide advice on this. If no easy fix, then disregard.

For single purified cells, we should probably offer a dropdown menu of all the neurons, all the early embryonic cells (up to the 8 cell stage), and the distal tip cells/linker cell.

Implement post_quant_analysis

Implement post_quant_analysis for 1-factor designs.

Scripts for 2-factor designs are not yet implemented but should include:

  1. Summary of DE genes in each factor and interaction
  2. Pairwise correlation plots with Orthogonal Regressions.
  3. Transcriptome-wide epistasis plot
  4. Batesonian Comparison plot

@dangeles

Fix project cleanup

There is an issue where Alaska flags some projects as stale even though they are still in the object.
It then removes these active projects.

Organism field in metadata

Currently the organism metadata field just says organism. In fact, this should be two fields:

Organism Species: Dropdown menu of scientific names of species
Species Genome: Dropdown menu of the genome versions, ordered in reverse numeric order. I.e., latest genome version first.
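
The ordering described for the Species Genome dropdown could be sketched as follows. This is only an illustration: the function name `order_genome_versions` is hypothetical, and it assumes genome versions are integer-like strings (such as the 235 and 266 mentioned in the Beta test #1 issues), which may not match how Alaska actually stores them.

```python
def order_genome_versions(versions):
    """Return genome versions sorted numerically, latest first.

    Assumes every version is an integer-like string (an assumption,
    not something confirmed by Alaska's code).
    """
    # Sort on the integer value: a plain string sort would place "99" after "235".
    return sorted(versions, key=int, reverse=True)


print(order_genome_versions(["235", "266", "99"]))
```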

Automatically close Sleuth Shiny web app after some time.

Currently, every Sleuth web server opened stays on until the Alaska server is shut down, which is far from ideal. There should be a way to detect when there is no longer activity on the Sleuth server (i.e. detect when the user disconnects) and shut the server down automatically after 'x' minutes/hours of inactivity.
If this is not possible, simply shutting down servers after some time is another option.
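
One way the inactivity timeout might be structured is sketched below. Everything here is an assumption for illustration: the `IdleTracker` class, its method names, and the `shutdown` callback are hypothetical and are not part of Alaska. How activity on a Sleuth server is actually detected is left open.

```python
import time


class IdleTracker:
    """Tracks the last activity time of each running server (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_seen = {}  # server id -> time of last recorded activity

    def touch(self, server_id):
        """Record activity (for example, a request from the user) for a server."""
        self.last_seen[server_id] = self.clock()

    def expired(self):
        """Return the ids of servers that have been idle longer than the timeout."""
        now = self.clock()
        return [sid for sid, seen in self.last_seen.items()
                if now - seen > self.timeout]

    def reap(self, shutdown):
        """Call shutdown(server_id) for every expired server, then forget it."""
        reaped = self.expired()
        for sid in reaped:
            shutdown(sid)
            del self.last_seen[sid]
        return reaped
```

If activity cannot be detected, calling `touch` only once when a server is opened turns this into the fallback option: each server is shut down a fixed time after it starts.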

Citation at end of analysis

We should provide a text like this along with the analysis results. Citations should be included for everything.

RNA-seq data was analyzed using Alaska with the (single, two)-factor
design option. Briefly, Alaska performs quality control using BowTie2, etc, etc,
etc... and outputs a summary report generated using MultiQC. Read quantification
and differential expression analyses of transcripts were performed using Kallisto
(v.XXXX) and Sleuth (v.XXXX). Kallisto was run using the following flags:
Line for single-end reads:
-b 200, -l (input), -standarddeviationflag (input), -bias

Line for paired-end reads:
-b 200, -bias, -
.
Reads were aligned using (Species) genome version (version) as provided by
Wormbase.

Differential expression analyses with Sleuth were performed using a (LR test or
Wald Test) corrected for multiple-testing.

If species == C. elegans: Enrichment analysis was performed using the
WormBase Enrichment Suite.

If two-factor design: Alaska performed epistasis analyses as first presented in
(cite hypoxia paper).
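
The conditional sentences at the end of this template could be assembled along these lines. This is a rough sketch only: the function name and the `species` and `design` arguments are hypothetical, and the placeholder citation is left exactly as written in the draft text.

```python
def conditional_citation_sentences(species, design):
    """Return the optional sentences of the citation text for one analysis.

    species: scientific name, e.g. "C. elegans" (hypothetical argument).
    design: "single-factor" or "two-factor" (hypothetical argument).
    """
    sentences = []
    if species == "C. elegans":
        sentences.append(
            "Enrichment analysis was performed using the WormBase Enrichment Suite."
        )
    if design == "two-factor":
        # The citation placeholder is kept verbatim from the draft text.
        sentences.append(
            "Alaska performed epistasis analyses as first presented in "
            "(cite hypoxia paper)."
        )
    return sentences
```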

Sleuth server stopped working

When the "open sleuth server" button is clicked on the analysis result page, the Sleuth server cannot be reached. I currently have no idea why it stopped working and will have to debug it. Could it be an issue with port forwarding on Docker?

Add a Lab identifier?

WormBase has PI identifiers. Maybe we should add this as an optional field?

Lab Identifier: Select your lab identifier from the dropdown. If your lab does not have an identifier, make one (link to page) or leave this blank.

Add additional data information in SOFT file

An example is here https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1561394.

See also the SOFT file specification for RNA-seq at https://www.ncbi.nlm.nih.gov/geo/info/soft-seq.html and the information about processed data files at https://www.ncbi.nlm.nih.gov/geo/info/seq.html#processed.

Specifically, we need to add the following labels:
!Sample_data_processing - Description of the data processing steps & software (including versions)
!Sample_supplementary_file - One for every processed data file
!Sample_processed_data_files_format_and_content - One for every processed data file, in tandem with !Sample_supplementary_file, describing the format and content of the file. Unclear how detailed it needs to be. @dangeles?
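
A minimal sketch of how these labels might be written out, assuming the usual SOFT `!label = value` line layout. The function name and its arguments are hypothetical, and the example file name and descriptions are illustrative placeholders rather than Alaska's actual output.

```python
def soft_processing_lines(processing_steps, processed_files):
    """Build SOFT sample lines for data processing and processed data files.

    processing_steps: list of free-text descriptions (software and versions).
    processed_files: list of (filename, description) tuples.
    """
    lines = []
    for step in processing_steps:
        lines.append(f"!Sample_data_processing = {step}")
    for filename, description in processed_files:
        # Emit the two labels in tandem, one pair per processed data file.
        lines.append(f"!Sample_supplementary_file = {filename}")
        lines.append(
            f"!Sample_processed_data_files_format_and_content = {description}"
        )
    return lines


# Illustrative values only; versions are left as placeholders.
for line in soft_processing_lines(
    ["Reads were quantified using Kallisto (v.XXXX)"],
    [("abundance.tsv", "Tab-delimited transcript abundance estimates")],
):
    print(line)
```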

Life-stages to choose from

Life-stages to choose from in the metadata should include:

Asterisks indicate life-stages that Raymond should approve, since some of these may have different annotations in WormBase.

Single-cell Embryo ***
2-cell Embryo ***
4-cell Embryo ***
Embryo
L1
L1 arrest
L2
L2d
Dauer
L3
L4
Young Adult
Adult
Post-egg-laying adult (unmated) ****
Post-egg-laying adult (mated) ****
Aged Adult (Animals >7 days old) ****

Hover descriptions in experimental design

The 1-factor description currently reads:

A 1-factor design contrasts a control sample with a single experimental sample.

However, I'm hearing from beta users this is confusing, and I agree. We should modify this to read:

A 1-factor design finds the differentially expressed transcripts between an experimentally perturbed sample (for example, a mutant strain) and a reference sample (often the wild-type strain). This is the most common experimental design for RNA-seq.

Misc. characteristics

No description is given for what a misc. characteristic is. An example would be helpful.

Beta test #2 Issues

PRJEB28388 and paired end errors

Submitted from the feedback form on the WormBase website.

We are using Alaska software to analyse our RNAseq data. We have two bugs to report:

  1. Two of the C. elegans reference genomes give errors. When using the ref genome PRJEB28388, our analysis failed at step 3 (differential expression analysis). When using ref genome PRJNA275000, analysis failed at step 2 (alignment and quantification). Our analysis was successful when using ref genome PRJNA13758.
  2. When inputting our reads as single end, our analysis was successful. When inputting our reads (in the same folders) as paired end, Alaska asked us to select our read pairs in the sample-specific metadata form. Instead of giving us the option to pair, for example, 'sample_1_read_1' with 'sample_1_read_2' (which showed up on the form), it only gave us the option to pair with the identical file, i.e. 'sample_1_read_1' with 'sample_1_read_1'.
    In the meantime, we are analyzing our data as single reads, but would appreciate some help with inputting our data as paired reads.
    Thank you for any help you can provide!
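
The pairing behavior the reporters expected in item 2 can be illustrated with a small sketch. This is not Alaska's form code: the function name is hypothetical, and it only shows that each read file should be offered every other file in the sample, never itself.

```python
def pairing_options(reads):
    """Map each read file to the other files it may be paired with."""
    # Exclude the file itself so a read can never be paired with an identical file.
    return {read: [other for other in reads if other != read] for read in reads}


options = pairing_options(["sample_1_read_1", "sample_1_read_2"])
print(options["sample_1_read_1"])
```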
