hemberg-lab / scrna.seq.course

This project forked from rstudio/bookdown-demo

670.0 670.0 360.0 373.55 MB

Analysis of single cell RNA-seq data course

Home Page: https://www.singlecellcourse.org

License: GNU General Public License v3.0

TeX 50.07% CSS 6.30% HTML 3.59% Python 6.10% Perl 13.25% R 6.82% Dockerfile 10.91% Nextflow 2.38% Shell 0.58%
bookdown course r scrna-seq scrna-seq-analysis sequencing

scrna.seq.course's People

Contributors

apredeus, chuliangxiao, davismcc, flying-sheep, nyaapass, prete, simondmurray, stephaniehicks, tallulandrews, tavareshugo, wikiselev, yihui


scrna.seq.course's Issues

PDF file of the course

Add texlive-full to the Docker image. However, it takes more than 2 hours to build, so we won't be able to build it on DockerHub. We should probably move Docker image building to Quay, and then we can put everything on https://dockstore.org/

obscure normalization results

Hi,
I am normalizing a data set that originally comes from 10X Genomics; I handle it as a SingleCellExperiment object using the scater package.
After filtering on MT content, total counts, and total features, I normalized the data as described in the course. However, when I plot the RLE, I get strange results: the raw data seem more 'normalized' than the data normalized with the methods, as shown below:
scranNorm.pdf

TMMnorm.pdf
UQnorm.pdf
RLEnorm.pdf
CPMnorm.pdf

To me, TMM looks good, but the others do not, especially scran.
Is there any explanation for this?
Thanks!

Remove SNN-cliq

We should consider removing SNN-cliq and replacing it with a more recent clustering method.

docker image link not correct

Just had an issue trying to connect to the docker image listed in section 2.2--it doesn't match the (seemingly correct) link in section 1.4

Could you please provide some intermediate files so people like me can go through the whole classes?

I was wondering whether you could provide the intermediate files used as examples in this amazing tutorial.

I downloaded some files, such as "ERR522959_2.fastq" and "ERR522959_1.fastq", to follow along with the class, but couldn't find files like "data/10cells_read1.fq", "data/10cells_read2.fq", "data/droplet_id_example_per_barcode.txt.gz", and "data/droplet_id_example_truth.gz".

I have experience analyzing bulk RNA-seq data, and am not currently working on scRNA-seq data. For some reasons, it is impractical for me to download all the original fastq files to our server to generate all the intermediate files. What I hope to do is follow along with your classes on my personal Mac to get familiar with the tools and the workflow of scRNA-seq analysis.

So I would greatly appreciate it if you could provide a link (either here or elsewhere) to download the intermediate files required to work through the examples in this class.

Best,

Jeff

Molecules.txt from tung data

I am trying to follow along with this workshop, but I cannot find the tung data file (molecules.txt) anywhere. Could someone please upload this data?

CPM issue

A reply from one user:

I’ve been following the videos and materials for the scRNA-seq course that you ran last year, and have found them very helpful, thanks to you and the rest of the organisers for providing this resource, and hope to be able to attend in person in future!

Just one potential bug that I have come across so far that I thought you might like to have a look at - in 15.2.1 RUVg (and repeated in 15.2.2) - I think this code is missing "* 1000000", as it seems to produce counts per "one" rather than counts per million.

set_exprs(umi.qc, "ruvg2_logcpm") <- log2(t(t(ruvg$normalizedCounts) / 
                                           colSums(ruvg$normalizedCounts)) + 1)

I only noticed this as I tried to run a t-SNE of the resulting ruvg2_logcpm expression data, and it produced a 'perplexity too high' error, which was not correctable by manually adjusting the perplexity. This led me to look at the expression values and notice they were all too low by a factor of 10^6. I'm not sure why this produced the error that it did (as the matrix dimensions were not affected?), but it clearly didn't like the low expression values, as the t-SNE ran without error after I multiplied by 1000000 to produce what look like true CPM. plotPCA ran happily with the original very low expression values.

Perhaps this has already been noted, or is only an issue for my data, but thought worth pointing out in any case!
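
The scaling difference described in this report can be checked on a toy matrix. The following is only a sketch in shell/awk, not the course's R code: `scale_by_column_total` is a hypothetical helper name, and the input is assumed to be a whitespace-separated count matrix with genes as rows, cells as columns, and no header. With a scale of 1 it reproduces the "counts per one" values; with a scale of 1000000 it gives counts per million.

```shell
# scale_by_column_total (hypothetical helper): read a count matrix on stdin
# and print each value divided by its column total, multiplied by SCALE.
#   SCALE=1        -> proportions ("counts per one")
#   SCALE=1000000  -> counts per million (CPM)
scale_by_column_total() {
  awk -v scale="$1" '
    {
      # Store every value and accumulate the per-column (per-cell) totals.
      for (j = 1; j <= NF; j++) { v[NR, j] = $j; total[j] += $j }
      if (NF > ncol) ncol = NF
    }
    END {
      for (i = 1; i <= NR; i++) {
        line = ""
        for (j = 1; j <= ncol; j++) {
          x = (total[j] > 0) ? v[i, j] / total[j] * scale : 0
          sep = (j > 1) ? " " : ""
          line = line sep sprintf("%.6g", x)
        }
        print line
      }
    }'
}

# Example: two genes, two cells, each cell with 4 counts in total.
#   printf '1 3\n3 1\n' | scale_by_column_total 1        # 0.25 0.75 / 0.75 0.25
#   printf '1 3\n3 1\n' | scale_by_column_total 1000000  # 250000 750000 / 750000 250000
```

The two outputs differ only by the constant factor, which matches the observation that the values were too low by a factor of one million.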

Sort out normalisation chapters

In the latest version of scater, normaliseExprs does not write to the norm_counts slot, but instead to the exprs slot. The normalisation chapter will have to be updated.

"trim_galore" command could not be found

I am trying to follow the tutorials in RStudio. In "process-raw-QC.Rmd", when I run the following command, it says that trim_galore could not be found. I redownloaded the image, but still got the problem. Could you please let me know how to solve this? Should I try an earlier version of this class? Thank you!

`trim_galore -h`

/tmp/RtmpgcO0S6/chunk-code-2e2786f67e.txt: line 1: trim_galore: command not found

Jeff
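
A "command not found" message means the shell running the chunk did not find an executable with that name on its PATH. As a minimal sketch for checking this inside the container, the helpers below (`have_tool` and `report_tool` are hypothetical names, not part of the course) use the standard `command -v` builtin to report whether a tool is visible:

```shell
# have_tool (hypothetical helper): succeed if the named executable is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# report_tool (hypothetical helper): print a one-line status for a tool name.
report_tool() {
  if have_tool "$1"; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
  fi
}

# Example:
#   report_tool trim_galore
```

If the tool is reported as missing, it is either not installed in the image being run or installed in a directory that is not on PATH for that session.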

Missing files. The latest added part "Processing RAW scRNA-seq data" has no raw data in git.

Hello, dear professors, thanks for your great course! You recently updated the course and added an important part on processing the raw data. However, there is no shared data folder in this git repository. I want to follow the course tasks and I need those files. It would be more convenient for those of us studying this course to use the data directly instead of finding them through the paper. By the way, the paper you cited, "Kolodziejczyk, Aleksandra A., Jong Kyoung Kim, Valentine Svensson, John C. Marioni, and Sarah A. Teichmann. 2015. “The Technology and Biology of Single-Cell RNA Sequencing.” Molecular Cell 58 (4). Elsevier BV: 610–20. doi:10.1016/j.molcel.2015.04.005", does not seem to provide any data. The data might come from another paper, by Deng. However, the SRA database holds so much raw data for that paper that we cannot find the data used in your course, which is supposed to be saved in a "share" folder.

Simulated vs Real data vs Integrated RNA seq data analysis

#1 What are the changes required if using sc-RNA seq count data simulated using popular methods like splatter compared to real data?
#2 Methods to integrate different sc-RNA seq data sets?
#3 Changes in analysis workflow when integrating sc-RNA seq data sets with each other and with other sc-omics data like ATAC-seq data?

detect outlier in plotPCA

Hi, I copied the code from http://hemberg-lab.github.io/scRNA.seq.course/cleaning-the-expression-matrix.html#exprs-qc and tried plotPCA for automatic cell filtering. I put the code "assay(umi, "logcounts") <- log2(counts(umi) + 1)" before running plotPCA, and got the following warnings:

Warning messages:
1: In .disambiguate_args(...) :
non-plotting arguments like 'pca_data_input' should go in 'run_args'
2: 'return_SCE=TRUE' is deprecated, use 'runPCA' instead

Also, the dots in the PCA plot were not colored according to whether they are outliers.
My R version is 3.5 and my scater package version is 1.8.0.

Thanks!

Jenkins stuff to remember

If you need to check the build console output or start a new build from outside Sanger.

  1. Use VPN and ssh to the instance with Jenkins, then

Last build console output

/var/lib/jenkins/jobs/PROJECT_NAME/builds/lastSuccessfulBuild/log

Start a new build

First create a crumb

CRUMB=$(curl -s 'http://USERNAME:PASSWORD@localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

Then start a new build using the crumb

curl -X POST -u USERNAME -H "$CRUMB" localhost:8080/job/sc-course/build

How to run browser in Docker?

Hi,

I am totally unfamiliar with Docker.

After installing Docker tool (under Win8), I run:
docker run -d -p 8787:8787 quay.io/hemberg-group/scrna-seq-course-rstudio, and it works.

How can I then visit localhost:8787 in the browser? I opened the browser and typed localhost:8787, but it did not show the login interface. Would you mind making this step clearer (especially for a Win8 environment)?

Could I run alignments in this Docker image in the future using just a laptop?

Sorry for these silly questions, I have just begun learning to preprocess raw fastq data.

Thanks a lot!

Wenhao
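
For reference, `-p 8787:8787` publishes the container's port 8787 on port 8787 of the Docker host, so the address to open depends on where the Docker host runs. The sketch below is only an illustration under assumptions: `rstudio_url` is a hypothetical helper, and the Docker Toolbox case (Docker inside a virtual machine named "default") is a guess about the Win8 setup rather than something stated in this issue.

```shell
# rstudio_url (hypothetical helper): build the URL to open in the browser.
# Arguments (both optional): host (default localhost), port (default 8787).
rstudio_url() {
  echo "http://${1:-localhost}:${2:-8787}"
}

# Typical use (not run here, since it needs a running Docker daemon):
#   docker ps      # confirm the container is up and shows the 8787 port mapping
#   rstudio_url    # Docker running directly on the host
#   rstudio_url "$(docker-machine ip default)"   # assumption: Docker Toolbox VM named "default"
```

When Docker runs inside a virtual machine, published ports are reachable at the VM's address rather than at localhost, which would explain a blank page at localhost:8787.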

STAR_explanation.png is missing

thanks for this great course.
In 3.5 the image for the STAR aligner is missing.
please either remove the link to it, or add the image again

thanks

New things

  1. Imputation chapter
  2. Separate chapter for SEURAT
  3. Add more and better batch correction methods
  4. (optional) Cross-dataset comparison (scmap, metaneighbour, mnnCorrect, CCA-part of Seurat)

Notes from course 31 Oct 2017

  • Make script for figure 3.2 (unique map, multi-map, unmapped) & include discussion of what if too many cells to visualize? -> fit linear or exponential distribution & find outliers
  • Discuss Droplets-> size proportional to wait time = exponential distribution -> ~exponential distribution of library sizes.

Fig 5: which script was used?

Hi,
I wonder if you already have script available for the following figure?

Figure 5.2: Example of the total number of reads mapped to each cell.
Thanks a lot,
Shuoguo

Notes from Berlin

  1. spend more time on last 3 days and less on processing raw reads
    differential isoform analysis?
    minimum # reads per cell
    umi-counting with kallisto how to aggregate umis from transcripts to genes
    Practicals to add: slalom, RNA velocity, dropEST, UMI-tools, BEARscc, Libinorm, Cell-cycle analysis
    Drop-seq pipeline
    imputation: when is zero really a zero?
