wormlabcaltech / alaska

Automated and friendly RNA-seq analysis (deprecated)

Home Page: http://alaska.caltech.edu

License: MIT License

R 1.35% Shell 2.18% Python 34.75% Dockerfile 1.08% Perl 4.53% HTML 21.19% JavaScript 20.96% CSS 13.35% PHP 0.61%

rna-seq automation pipeline alaska analysis analysis-pipeline portal quality-control sleuth kallisto

alaska's Issues

Beta test #1 issues

User must see directory structure modal (shown when "See Examples" button is clicked) before uploading their reads. Make the modal show up when "See FTP details" is clicked. At the bottom, add a button labeled "I have prepared my reads" that shows the FTP details when clicked. Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
~~Fix FTP server not accepting passive connections.~~ Fixed on 1/11/2019 by reusing old ftpd_server.
~~Testing buttons show even though "testing=true" is not present in the url.~~ Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
~~Fix issue where sometimes the "Start a new project!" button isn't disabled even though a project id was given through the url.~~ Fixed on 1/12/2019 commit d5acc2b68299960ca86e029aaa76ad64f19ca6f5
~~Make user-expandable list inputs more intuitive to use. When the "Add" button is clicked, add new row for the specified input and reset the textbox.~~ Implemented on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
For custom dropdowns (dropdowns where there is a choice for "other"), make it obvious to the user that they must fill in the textbox when "other" is selected. This may be achieved by either 1) hiding the textbox until "other" is selected or 2) putting a placeholder value in the textbox that reads "If 'other' selected". Implemented on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
~~When inputting all factor values, values that have been removed with the "Remove" button still show up in the factor dropdown of individual samples.~~ Fixed on 1/13/2019 commit 2ba411baf0f656f161e9acfbfebed8cc4ef0e30d
~~Make sample description & factor choices easier to fill out at a larger scale.~~ Implemented on 2/22/2019 commit 3b1f109fb054f57280b8ff4d2aadfd71df37de6d
Implement new isoform annotations into sleuth analysis. Currently, annotations are pulled directly from Ensembl through BioMart. However, BioMart only supports C. elegans genome version 235, which is very outdated compared to the more recent 266. Gene isoform annotations can be found here: ftp://ftp.wormbase.org/pub/wormbase/species/c_elegans/PRJNA13758/annotation/geneIDs/. Implemented on 1/16/2019 commit 4d04a0f4528a5c8402cf0065e93f84df02843509. Alaska no longer fetches annotation data from Ensembl BioMart, but instead uses an annotation flatfile.

Tissue metadata

Current options for tissue are:

Whole-worm (multi-worm)
Whole-worm (single-worm)
Other

Maybe we should expand other to:
single cell
single cell type/Tissue

For cell type/Tissue, we could provide the largest 10 tissues in C. elegans. Raymond can provide advice on this. If no easy fix, then disregard.

For single purified cells, we should probably offer a dropdown menu of all the neurons, all the early embryonic cells (up to the 8 cell stage), and the distal tip cells/linker cell.

Support all Wormbase organisms

Support non-elegans organisms by downloading and placing the appropriate files from WormBase.

Implement post_quant_analysis

Implement post_quant_analysis for 1-factor designs.

Scripts for 2-factor designs are not yet implemented but should include:

Summary of DE genes in each factor and interaction
Pairwise correlation plots with Orthogonal Regressions.
Transcriptome-wide epistasis plot
Batesonian Comparison plot

@dangeles

Fix project cleanup

There is an issue where Alaska flags some projects as stale even though they are still in the object.
It then removes these active projects.

Organism field in metadata

Currently the organism metadata field just says organism. In fact, this should be two fields:

Organism Species: Dropdown menu of scientific names of species
Species Genome: Dropdown menu of the genome versions, ordered in reverse numeric order. I.e., latest genome version first.
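
The "reverse numeric order" requirement could be sketched as below. This is only an illustration: `sort_versions_latest_first` is a hypothetical helper, and it assumes the genome versions are available as purely numeric strings. Comparing them as integers matters, because a plain string sort would place "1000" below "266".

```python
def sort_versions_latest_first(versions):
    """Return genome versions ordered from newest to oldest.

    Versions are compared as integers, not as strings, so that a
    four-digit version sorts above a three-digit one.
    """
    return sorted(versions, key=int, reverse=True)


# Hypothetical version labels, for illustration only.
dropdown_order = sort_versions_latest_first(["235", "266", "1000", "250"])
print(dropdown_order)
```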

Automatically close Sleuth Shiny web app after some time.

Currently, every Sleuth web server opened stays on until the Alaska server is shut down, which is far from ideal. There should be a way to detect when there is no longer activity on the Sleuth server (i.e. detect when the user disconnects) and shut the server down automatically after 'x' minutes/hours of inactivity.
If this is not possible, simply shutting down servers after some time is another option.
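
One possible shape for the inactivity timeout is sketched below. How activity on a Sleuth server would actually be detected is not settled in this issue, so the sketch only covers the bookkeeping: `IdleTracker`, `touch`, `expired`, and `forget` are hypothetical names, and the caller is assumed to call `touch` whenever it observes activity and to shut down whatever `expired` returns.

```python
import time


class IdleTracker:
    """Tracks the last activity time of each server and reports idle ones."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested
        self.last_seen = {}

    def touch(self, server_id):
        """Record activity (for example, a request) for a server."""
        self.last_seen[server_id] = self.clock()

    def expired(self):
        """Return ids of servers idle for at least the timeout."""
        now = self.clock()
        return [sid for sid, seen in self.last_seen.items()
                if now - seen >= self.timeout]

    def forget(self, server_id):
        """Stop tracking a server once it has been shut down."""
        self.last_seen.pop(server_id, None)
```

If activity cannot be detected at all, calling `touch` once when the server is opened and never again turns the same code into the simpler fallback of shutting servers down after a fixed time.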

Citation at end of analysis

We should provide a text like this along with the analysis results. Citations should be included for everything.

RNA-seq data was analyzed using Alaska with the (single, two)-factor
design option. Briefly, Alaska performs quality control using Bowtie2, etc, etc,
etc... and outputs a summary report generated using MultiQC. Read quantification
and differential expression analyses of transcripts were performed using Kallisto
(v.XXXX) and Sleuth (v.XXXX). Kallisto was run using the following flags:
Line for SE reads:
-b 200, -l (input), -standarddeviationflag (input), -bias

Line for PE reads:
-b 200, -bias, -
.
Reads were aligned using (Species) genome version (version) as provided by
WormBase.

Differential expression analyses with Sleuth were performed using a (LR test or
Wald Test) corrected for multiple-testing.

If species == C. elegans: Enrichment analysis was performed using the
WormBase Enrichment Suite.

If two-factor design: Alaska performed epistasis analyses as first presented in
(cite hypoxia paper).
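
A rough sketch of how the conditional parts of this text could be assembled from a project's settings follows. `build_citation` and its parameters are hypothetical, not part of Alaska. The quality-control sentence and the Kallisto flag lines are left out because the draft above does not finish them, and the hypoxia paper reference stays a placeholder.

```python
def build_citation(design, species, genome_version, test,
                   kallisto_version, sleuth_version):
    """Assemble the methods paragraph from the project's settings.

    design is "single" or "two"; test is e.g. "LR test" or "Wald test".
    """
    parts = [
        f"RNA-seq data was analyzed using Alaska with the {design}-factor "
        "design option.",
        "Read quantification and differential expression analyses of "
        f"transcripts were performed using Kallisto (v.{kallisto_version}) "
        f"and Sleuth (v.{sleuth_version}).",
        f"Reads were aligned using {species} genome version "
        f"{genome_version} as provided by WormBase.",
        "Differential expression analyses with Sleuth were performed "
        f"using a {test} corrected for multiple testing.",
    ]
    # Species-specific and design-specific sentences from the draft.
    if species == "C. elegans":
        parts.append("Enrichment analysis was performed using the "
                     "WormBase Enrichment Suite.")
    if design == "two":
        parts.append("Alaska performed epistasis analyses as first "
                     "presented in (cite hypoxia paper).")
    return " ".join(parts)
```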

Sleuth server stopped working

When the "open sleuth server" button is clicked on the analysis result page, the sleuth server cannot be reached. Currently have no idea why it stopped working... Will have to work on debugging. May be an issue with port forwarding on Docker?

Give visual feedback when metadata is saved

Provide the users with some kind of visual feedback when they press "Save and apply changes" at each of the metadata cards.

Add a Lab identifier?

WormBase has PI identifiers. Maybe we should add this as an optional field?

Lab Identifier: Select your lab identifier from the dropdown. If your lab does not have an identifier, make one (link to page) or leave this blank.

Issue server commands via the terminal the server is running on.

This would be more convenient for the purposes of testing/managing the server (from the server host), rather than running Request.sh every time on a separate terminal.
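
As a minimal sketch of the idea, the server process could read commands from its own terminal in a background thread. Everything here is an assumption for illustration: `console_loop`, `start_console`, the `handlers` mapping, and the `quit` command are hypothetical, and how they would hook into the existing request handling is not shown.

```python
import threading


def console_loop(handlers, read_line=input):
    """Read commands typed into the server's terminal and dispatch them."""
    while True:
        try:
            line = read_line().strip()
        except EOFError:
            break  # terminal closed
        if not line:
            continue
        name, *args = line.split()
        if name == "quit":
            break
        handler = handlers.get(name)
        if handler is None:
            print(f"unknown command: {name}")
        else:
            handler(*args)


def start_console(handlers):
    """Run the console in a daemon thread so it does not block the server."""
    thread = threading.Thread(target=console_loop, args=(handlers,),
                              daemon=True)
    thread.start()
    return thread
```

The server would call `start_console` once at startup with a mapping from command names to the functions that already service those requests.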

Modify Sleuth script to match David's

Modify sleuth.R script to match the analysis of diff_exp_analyzer.r.
This may fix some plots not showing up on the shiny web server.

Add additional data information in SOFT file

An example is here https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1561394.

See also the SOFT file specification for RNA-seq at https://www.ncbi.nlm.nih.gov/geo/info/soft-seq.html and the information about processed data files at https://www.ncbi.nlm.nih.gov/geo/info/seq.html#processed.

Specifically, we need to add the following labels:
!Sample_data_processing - Description of the data processing steps & software (including versions)
!Sample_supplementary_file - One for every processed data file
!Sample_processed_data_files_format_and_content - One for every processed data file, in tandem with !Sample_supplementary_file, describing the format and content of the file. Unclear how detailed it needs to be. @dangeles?
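
The three labels above could be emitted per sample roughly as follows. This is a sketch under assumptions: `soft_sample_lines` is a hypothetical helper, the `!Label = value` layout follows the general SOFT convention, and whether GEO expects numbered suffixes on repeated labels should be checked against the specification linked above.

```python
def soft_sample_lines(processing_steps, processed_files):
    """Build the extra SOFT lines for one sample.

    processing_steps: strings describing each step and software version.
    processed_files: (filename, description) tuples, one per processed file.
    """
    lines = [f"!Sample_data_processing = {step}" for step in processing_steps]
    for filename, description in processed_files:
        # Keep each file line next to the line describing its content.
        lines.append(f"!Sample_supplementary_file = {filename}")
        lines.append(
            f"!Sample_processed_data_files_format_and_content = {description}"
        )
    return lines
```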

Life-stages to choose from

Life-stages to choose from in the metadata should include:

Asterisks (*** or ****) indicate that Raymond should approve these, since some of these life-stages may have different annotations in WormBase.

Single-cell Embryo ***
2-cell Embryo ***
4-cell Embryo ***
Embryo
L1
L1 arrest
L2
L2d
Dauer
L3
L4
Young Adult
Adult
Post-egg-laying adult (unmated) ****
Post-egg-laying adult (mated) ****
Aged Adult (Animals >7 days old) ****

Automatically load most recent server save on start.

When the server is started with Start.sh, automatically scan the saves directory and load the most recent save.
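
Finding the most recent save could look like the sketch below. It assumes every regular file in the saves directory is a save and that "most recent" means latest modification time; `latest_save` is a hypothetical helper, and the actual save naming scheme may allow a more reliable ordering.

```python
from pathlib import Path


def latest_save(saves_dir):
    """Return the most recently modified file in saves_dir, or None if empty."""
    candidates = [p for p in Path(saves_dir).iterdir() if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Returning `None` for an empty directory lets the server fall back to a fresh start on its first run.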

Hover descriptions in experimental design

The 1-factor description currently reads:

A 1-factor design contrasts a control sample with a single experimental sample.

However, I'm hearing from beta users this is confusing, and I agree. We should modify this to read:

A 1-factor design finds the differentially expressed transcripts between an experimentally perturbed sample (for example, a mutant strain) and a reference sample (often the wild-type strain). This is the most common experimental design for RNA-seq.

~~Change "Misc. characteristics" to just "Misc" so that it is clear what the form is asking for is related to the previous input fields.~~ Implemented in commit 0a3678aa1960f15e22ecf87c08edabdcf407b211
~~Add a description blurb above the samples metadata form, so that it is clear what the user has to do for each sample. (i.e. they have to select each sample and fill out the form).~~ Implemented in commit 0a3678aa1960f15e22ecf87c08edabdcf407b211
~~Change "Control value" to "control" in project controls form.~~ Implemented in commit 0a3678aa1960f15e22ecf87c08edabdcf407b211
~~Email: make sure people know the email is from Alaska -- change sender email to something more noticeable ([email protected]?)~~ Implemented in commit 15d3f79dc62e1b9c93330daa10b7558b6415acca
When the user clicks "open sleuth server," for the server window to show up properly the browser must have popups enabled. Show a popup stating the user must have popups enabled for the window to display properly. Implemented in commit 88f54c0b74a83749746845769874a56ce2363fe7
The compression used by Alaska to compress project folders needs to be changed to be compatible with MacOS. (An issue with the compression methods of bsdtar and gnutar, as indicated at https://apple.stackexchange.com/questions/197839/why-is-extracting-this-tgz-throwing-an-error-on-my-mac-but-not-on-linux.) Fixed in commit 15d3f79dc62e1b9c93330daa10b7558b6415acca

Sleuth.R does not include required libraries

Library inclusions were accidentally removed in Sleuth.R on commit 3b66f0f.
Sleuth.R should include these libraries: sleuth, optparse and files.

PRJEB28388 and paired end errors

Submitted from the feedback form on the WormBase website.

We are using Alaska software to analyse our RNAseq data. We have two bugs to report:

Two of the C. elegans reference genomes give errors. When using the ref genome PRJEB28388, our analysis failed at step 3 (differential expression analysis). When using ref genome PRJNA275000, analysis failed at step 2 (alignment and quantification). Our analysis was successful when using ref genome PRJNA13758.
When inputting our reads as single end, our analysis was successful. When inputting our reads (in the same folders) as paired end, Alaska asked us to select our read pairs in the sample-specific metadata form. Instead of giving us the option to pair for example 'sample_1_read_1' with 'sample_1_read_2' (which showed up on the form), it only gave us the option to pair with the identical file i.e. 'sample_1_read_1' with 'sample_1_read_1'.
In the meantime, we are analyzing our data as single reads, but would appreciate some help with inputting our data as paired reads.
Thank you for any help you can provide!
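
The expected pairing behavior described in this report can be sketched as below. This is not Alaska's actual form logic, which is not shown here; `pair_options` is a hypothetical helper that only illustrates the rule that a read file should be offered every other file in the sample as a mate, and never itself.

```python
def pair_options(read_files):
    """Map each read file to the files it may be paired with.

    A file is never offered as its own mate, which is the behavior
    the report describes as missing.
    """
    return {f: [other for other in read_files if other != f]
            for f in read_files}


# File names taken from the report above.
options = pair_options(["sample_1_read_1", "sample_1_read_2"])
print(options["sample_1_read_1"])
```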
