DaneseAnna commented on August 12, 2024

Hi,

This is a very interesting and complex question. I will try to give an answer/discussion.

Let me break your question down into two smaller ones:
a) How can we distinguish actual cell types with very little open chromatin from low-quality barcodes (which may not even be cells, so I am not calling them cells in this case)?
b) How can we identify these low-depth cell types when the first component seems to be library size?

Technically low-depth barcodes/cells can be caused by two things:
how many reads were sequenced from the library
how many insertions the transposase actually managed to make in a cell
Biological low depth can be due to:
a very closed chromatin state in the cell type
a cell type/nucleus that doesn't withstand the protocol, or whose chromatin is harder for the transposase to integrate into

About a):
All cells, independently of cell type, have some regions that should systematically be open (RNA Pol II, other ubiquitous genes), so we can expect a minimum number of insertions per cell.
This can be explored during QC, for example by checking TSS enrichment at housekeeping genes.
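
As a rough illustration of that check, here is a minimal sketch that counts, for each barcode, how many housekeeping promoters received at least one insertion. It is not an episcanpy function: the barcode names, the promoter list, the toy counts, and the `min_fraction` cutoff are all assumptions made for the example.

```python
# Toy insertion counts at housekeeping promoters (hypothetical values, not real data).
# Each list holds counts for the promoters in `housekeeping_promoters`, in order.
housekeeping_promoters = ["ACTB", "GAPDH", "POLR2A", "RPLP0"]
counts = {
    "barcode_A": [3, 2, 1, 4],  # well-covered cell
    "barcode_B": [1, 0, 1, 1],  # low depth, but housekeeping promoters are still hit
    "barcode_C": [0, 0, 0, 0],  # low depth and nothing at housekeeping promoters
}


def housekeeping_fraction(promoter_counts):
    """Fraction of housekeeping promoters with at least one insertion."""
    hit = sum(1 for c in promoter_counts if c > 0)
    return hit / len(promoter_counts)


def flag_suspect_barcodes(counts, min_fraction=0.5):
    """Barcodes whose housekeeping coverage falls below `min_fraction` (assumed cutoff)."""
    return sorted(
        barcode
        for barcode, promoter_counts in counts.items()
        if housekeeping_fraction(promoter_counts) < min_fraction
    )


fractions = {b: housekeeping_fraction(c) for b, c in counts.items()}
suspect = flag_suspect_barcodes(counts)
print(fractions)
print(suspect)
```

A barcode with few reads that still covers most housekeeping promoters looks more like a real, closed-chromatin cell than one with no signal there at all, although the cutoff would need to be tuned on your own data.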

About b):

Usually, if you are using peaks, you will identify them from highly covered cells, so you will lose the low-depth cells as noisy (having more than x percent of their reads outside of peaks). You can try an annotation-based feature space to keep some of the biological signal.

You can also focus on the lowly covered cells and try a different feature space, like promoter regions or small windows, to see whether some regions are enriched in the lowly covered cells and might be cell type specific. Once you have done that, you can decide on a feature space containing the cell-type-specific regions and look at all the cells together.
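
One possible way to look for such regions, sketched under assumptions: split barcodes by library size and compare, per feature, the fraction of cells in which the feature is open. The toy matrix, the `depth_cutoff`, and the `min_diff` threshold are hypothetical, and the function names are mine rather than part of episcanpy.

```python
def open_fraction(matrix, rows):
    """Per-feature fraction of the given cells with at least one insertion."""
    n_features = len(matrix[0])
    return [
        sum(1 for r in rows if matrix[r][j] > 0) / len(rows)
        for j in range(n_features)
    ]


def enriched_in_low(matrix, library_size, depth_cutoff, min_diff=0.3):
    """Indices of features open more often in low-depth than in high-depth cells."""
    low = [i for i, size in enumerate(library_size) if size < depth_cutoff]
    high = [i for i, size in enumerate(library_size) if size >= depth_cutoff]
    if not low or not high:
        raise ValueError("depth_cutoff must leave cells in both groups")
    low_frac = open_fraction(matrix, low)
    high_frac = open_fraction(matrix, high)
    return [j for j in range(len(low_frac)) if low_frac[j] - high_frac[j] >= min_diff]


# Toy binarised cells x features matrix (e.g. promoters or small windows).
matrix = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
library_size = [5000, 4200, 3900, 400, 350, 300]  # total fragments per barcode (toy)

enriched = enriched_in_low(matrix, library_size, depth_cutoff=1000)
print(enriched)
```

Features flagged this way are only candidates. A proper differential test that accounts for depth would be needed before calling them cell type specific.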

So, to some extent you can salvage the biologically low-depth cells from the technically lowly covered ones. However, you will still have the library size effect. It is a big technical artefact, and it does not disappear despite excluding PC1 and/or doing library size correction.

You can check the relationship between library size (or any other technical artifact) and the principal components using the function correlation_pc. This is very useful for identifying artifacts in the data. However, we would not recommend removing the first four PCs, as that would remove a lot of the biological variation present in the data. As you can see, library size is mainly correlated with PC1; to check how much it explains the other PCs, you can use correlation_pc.
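
To make the idea concrete without relying on the exact signature of correlation_pc, here is a minimal standalone sketch of the same kind of check: a Pearson correlation between log10 library size and each PC's scores. The PC scores and library sizes are toy values, and `correlate_pcs_with_depth` is a hypothetical helper, not the episcanpy implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlate_pcs_with_depth(pc_scores, library_size):
    """Correlation of each PC (column of pc_scores) with log10 library size."""
    log_depth = [math.log10(size) for size in library_size]
    n_pcs = len(pc_scores[0])
    return [
        pearson([row[k] for row in pc_scores], log_depth)
        for k in range(n_pcs)
    ]


# Toy data: six cells, two PCs. PC1 was built to track depth, PC2 was not.
library_size = [100, 1000, 10000, 100, 1000, 10000]
pc_scores = [
    [-1.0, 1.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [-1.1, -1.0],
    [0.1, -1.0],
    [0.9, -1.0],
]

correlations = correlate_pcs_with_depth(pc_scores, library_size)
for k, r in enumerate(correlations, start=1):
    print(f"PC{k}: r = {r:.3f}")
```

In this toy case only PC1 is strongly tied to depth, which is the situation where dropping further PCs would mostly throw away biological variation.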

from episcanpy.
