cangermueller commented on August 20, 2024

Hi guys,

sorry for the late reply, I was traveling. I am happy to hear that you want to review DeepCpG.

I did not use a 2d convolutional kernel of size C x L to learn dependencies between C cells and L CpG sites, since here the information flow between cells would depend on the ordering of rows (=cells) in the input tensor. Instead, I used a 2d convolution with kernel size 1 x L to only learn dependencies between CpG sites. Dependencies between cells are learnt afterwards by fusion modules, i.e. hidden layers that are connected to all output neurons of the CpG module and the DNA module. This is the same as scanning the CpG neighbourhood of each cell with 1d convolutions, sharing their weights, and connecting the resulting hidden layers. However, that would be slower. Does this make sense?
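The point about row ordering can be illustrated with a minimal numpy sketch (toy sizes and random data, not from the paper): a 1 x L kernel never spans multiple rows, so permuting the cells merely permutes the outputs, with no order-dependent mixing across cells.

```python
import numpy as np

# Hypothetical toy sizes: C cells, W CpG sites, filter length L.
rng = np.random.default_rng(0)
C, W, L = 4, 10, 3
x = rng.normal(size=(C, W))   # per-cell CpG neighbourhood signal
w = rng.normal(size=L)        # one shared 1 x L convolutional filter

def conv_1xL(x, w):
    # "Valid" convolution with a 1 x L kernel: each cell (row) is
    # convolved independently with the same shared weights.
    return np.stack([np.convolve(row, w[::-1], mode="valid") for row in x])

y = conv_1xL(x, w)

# Permuting the rows (cells) only permutes the outputs accordingly,
# so the result does not depend on any particular cell ordering.
perm = rng.permutation(C)
assert np.allclose(conv_1xL(x[perm], w), y[perm])
```

A C x L kernel, by contrast, would mix values across rows and make the output depend on which cell sits in which row.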

Concerning point '1. DNA module': Prediction performance only increased slightly by using a window wider than +/- 250bp. As a trade-off between compute costs and performance, I therefore decided to use +/- 250bp.

Concerning point 3. 'Why we should include it': I tried to make the model interpretable by

  • Visualizing DNA motifs (weights of convolutional filters)
  • Correlating activations of convolutional filters with predicted CpG methylation states
  • Using learnt DNA motifs to predict cell-to-cell variability
  • Quantifying the influence of base-pair mutations and neighboring CpG sites by gradient back-propagation

Let me know if I can help you with anything else.

Best,
Christof

from deep-review.

cangermueller commented on August 20, 2024

Twice the window size means twice as much GPU memory and compute time. The main concern is the memory bottleneck of GPUs. E.g. the cluster I used only had GPUs with 4 GB.

I'll have a look at the entire review.

gwaybio commented on August 20, 2024

Very well written article predicting binary methylation status (0: hypomethylation, 1: hypermethylation) in single-cell bisulfite sequencing experiments (scBS-seq). A secondary goal is to visualize the DNA motifs contributing to methylation status and to cellular methylation heterogeneity.

Biology

The authors use scBS-seq data from 32 mouse embryonic stem cells to build their deep network. The features of the network are described in detail and consist of DNA sequence elements and nearby methylation states of the target cells and other experimental cells. Since scBS-seq experiments cover only 20–40% of CpG sites because of low DNA yields, models that can impute methylation states in missing regions are extremely important. The authors also show variable predictive performance of their model depending on the sequence context of the target CpG (e.g. TSS, exon, promoter, CpG island, etc.).

Computational Aspects

There are three deep networks in the model, all built on convolutional neural networks (CNNs) with one hidden convolutional layer (max pooling and ReLU activations) feeding a fully connected layer. Some aspects of the architecture were difficult to decipher (e.g. stride of convolution, feature map size).

  1. DNA module
    • Uses sequence elements +/- 250 bp from given CpG
      • The authors did test shorter sequence lengths and report decreased performance
      • It is unclear if larger, or more biologically informed windows would improve performance
    • Convolution in 1 dimension - akin to detecting position-specific scoring matrices (PSSM)
  2. CpG module
    • Binary methylation state +/- 25 neighboring CpG in target cell and in other assayed cells
    • Convolution in 2 dimensions taking into account other cells that may have the target CpG measured
  3. Fusion module
    • Receives the CNN output from both the DNA and CpG modules
    • Fully connected with one output node
      • Sigmoid activation on output layer to predict binary ŷ ∈ {0, 1}

Model trained with dropout, Glorot-initialized weights, and Adam adaptive learning with early stopping. What is especially nice is the availability of all code used to implement the model.
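The three-module pipeline summarized above can be sketched as a toy numpy forward pass (hypothetical dimensions and random weights for illustration only; this is not the paper's implementation, which uses a real deep-learning framework):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy dimensions (illustrative, not the paper's values).
SEQ = 20            # length of the one-hot-derived DNA window signal
CELLS, NEIGH = 3, 8 # cells and neighbouring CpG sites for the CpG module

def relu(z):
    return np.maximum(z, 0.0)

def conv1d(x, w):
    # "Valid" 1d cross-correlation of signal x with filter w.
    L = len(w)
    return np.array([x[i:i + L] @ w for i in range(len(x) - L + 1)])

# DNA module: convolve the sequence window, ReLU, then global max pooling.
dna_in = rng.normal(size=SEQ)
dna_filt = rng.normal(size=5)
dna_feat = relu(conv1d(dna_in, dna_filt)).max()

# CpG module: a shared 1 x L filter convolves each cell's neighbourhood
# separately, so parameters do not grow with the number of cells.
cpg_in = rng.normal(size=(CELLS, NEIGH))
cpg_filt = rng.normal(size=3)
cpg_feat = relu(np.stack([conv1d(row, cpg_filt) for row in cpg_in])).max()

# Fusion module: fully connected layer over both modules' outputs,
# sigmoid activation on the single output node -> methylation probability.
h = np.array([dna_feat, cpg_feat])
w_fc, b = rng.normal(size=2), 0.0
p = 1.0 / (1.0 + np.exp(-(h @ w_fc + b)))
assert 0.0 < p < 1.0
```

Dropout, Glorot initialization, Adam, and early stopping are training-time details omitted from this forward-pass sketch.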

Why we should include it in our review

  1. Deep learning for epigenetics - I buy this one more than #68
  2. Deep learning in single cells
    1. This data is huge and will only continue to grow - one area where deep learning could have a more profound impact
  3. Produces nice interpretations/visualizations (PSSM motifs) for what the DNA module is actually learning in the convolution (with added interpretations of heterogeneity of single cells)
    1. One example of overcoming a black box (although the black box remains for the CpG and fusion modules)

I am tagging the first author of the article @cangermueller to make sure I didn't miss anything and/or to add on to this summary.

agitter commented on August 20, 2024

@gwaygenomics What did they do to create the 2D input for the CpG module if the single cells are initially unordered? Did they create a cell-cell similarity matrix? This relates to the discussion in #79.

gwaybio commented on August 20, 2024

@agitter Yeah, I stared at this bit for a while - still not sure if I'm understanding correctly. From the supplement:

The methylation state and distance of observed neighbouring CpG sites are inputs to a 2d-convolutional layer. Importantly, this layer convolves each cell separately with the same convolutional filters to unlink the number of model parameters from the number of cells, which can be large.

It looks like the convolutions are only at the single-cell level, but weights are shared across cells. This makes more sense, since any structure across cells would be artificially imposed.

agitter commented on August 20, 2024

@gwaygenomics You're right, and that makes a lot more sense. They say:

A 2d-convolution layer convolves the CpG neighbourhood of each cell t independently at every position i by using filters W_f of dimension 1 x L x D and length L

There is still something interesting that they are doing with the distances between neighboring CpG sites that I need to look at further.
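One plausible reading (an assumption, not confirmed by the excerpt) is that the methylation state and the distance to the target site form two input channels per neighbouring CpG, i.e. the D dimension in the 1 x L x D filters. A toy encoding might look like:

```python
import numpy as np

# Hypothetical encoding sketch: each neighbouring CpG contributes two
# channels per cell -- its binary methylation state and its (scaled)
# genomic distance to the target site. All values below are made up.
states = np.array([[1, 0, 1],
                   [0, 0, 1]])                 # (cells, neighbours), binary
dists = np.array([[120.0, 35.0, 410.0],
                  [88.0, 12.0, 950.0]])        # base pairs to target CpG
dists /= dists.max()                           # simple scaling assumption

x = np.stack([states, dists], axis=-1)         # shape (cells, neighbours, 2)
assert x.shape == (2, 3, 2)
```

Under this reading, D = 2 and the shared 1 x L filters see both what the neighbours' states are and how far away they sit.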

cgreene commented on August 20, 2024

@cangermueller Thank you for providing context for your paper! Regarding point 1, what kind of computational costs would have been required to go to a larger window (say 1k bp)? Are there any practical concerns (e.g. the examples become somewhat more unique with a larger window and thus more training data is required)?

I could easily see some discussion of the computational costs associated with scaling these methods included in the review. If you want to pitch in on the full review (via #2 and #88) we'd love to get your perspective.

agitter commented on August 20, 2024

In the quest for common themes across papers, note that the authors of #24 also wrote that memory was a limiting factor.

@cangermueller if you do decide you want to contribute more, I'd be interested in your thoughts on what topics weren't covered in your recent review #47. We all thought that was an excellent overview and aim to provide a different perspective here, as described in #2 and #88.

agitter commented on August 20, 2024

As noted in #244, this preprint was updated this month. I haven't checked the differences, but there was mention of updated code at https://github.com/cangermueller/deepcpg/

We may consider highlighting the software as one example of a project that provides good documentation, IPython notebook examples, pre-trained models ("model zoo"), etc.

cangermueller commented on August 20, 2024

The main differences are:

  • Different model architecture
    • DNA module has two instead of one conv layer
    • DNA module operates on 1001bp window instead of 501bp window
    • CpG module is bidirectional GRU instead of CNN
  • Extended evaluation
    • Five instead of two cell types, including human and mouse cells
    • Comparison of scBS-seq vs. scRRBS-seq
    • Evaluation of predicted mutation effects on known mQTLs
  • Results
    • New model architecture more accurate
    • Performance gain highest for scRRBS-seq profiled cells
    • Predicted mutation effects higher for known mQTLs

As you noted, I have also refactored the code base of DeepCpG and provided pre-trained models and notebooks. However, it is not yet perfect. I am still extending the documentation and notebooks.

Let me know if anything is still unclear!

cangermueller commented on August 20, 2024

What is not mentioned in the manuscript: batch normalization yielded worse results, so it is not used. I also evaluated a couple of different architectures for the DNA module, including convolutional-recurrent models, ResNets, and dilated convolutions. However, a quite simple CNN with two conv layers and one FC layer with 128 units performed best.

agitter commented on August 20, 2024

@cangermueller thanks for updating us here. Those sound like some major improvements.

I really like the runnable examples and effort to make the software reusable.

agitter commented on August 20, 2024

I edited the original post with the DOI of the published version.

cangermueller commented on August 20, 2024

Thanks!
