Deepred-Mt is a novel method based on deep neural networks to predict C-to-U editing sites in angiosperm mitochondrial RNA.

License: MIT License


Deepred-Mt: Deep Representation Learning for Predicting C-to-U RNA Editing in Plant Mitochondria

This repository contains the official implementation of Deepred-Mt, along with instructions for reproducing results presented in "Deepred-Mt: Deep Representation Learning for Predicting C-to-U RNA Editing in Plant Mitochondria", by A. A. Edera, I. Small, D. H. Milone, and M. V. Sanchez-Puerta. Download PDF.

In land plants, the editosome is a highly sophisticated molecular machine that post-transcriptionally converts cytidines into uridines (C-to-U) at highly specific RNA positions called editing sites. This RNA editing seems to be partially governed by cis elements, which remain difficult to characterize.

Deepred-Mt is a novel neural network that predicts C-to-U editing sites in angiosperm mitochondria. Given an RNA sequence consisting of a central cytidine flanked by 20 nucleotides on each side (a 41-nucleotide window), Deepred-Mt outputs a score indicating how likely the central cytidine is to be edited.

The score is computed from complex cis elements or motifs automatically extracted from the flanking bases by a multi-layer convolutional neural network, whose full architecture is schematically shown below.

[Figure: schematic of the Deepred-Mt architecture]
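To make the idea of a convolutional filter scanning the flanking bases concrete, here is a minimal pure-Python sketch. It is not the Deepred-Mt implementation: the names `one_hot` and `conv1d_valid` are hypothetical, the filter is a toy exact-match detector for "GGCG" rather than a learned weight matrix, and real convolutional layers add biases, nonlinearities, and many filters.

```python
ALPHABET = "ACGT"  # assumed channel order; the real encoding may differ


def one_hot(seq):
    """Encode a nucleotide string as a list of 4-element rows (one per base)."""
    return [[1.0 if base == a else 0.0 for a in ALPHABET] for base in seq.upper()]


def conv1d_valid(x, kernel):
    """Slide one filter (width x 4) along an encoded sequence (length x 4).

    Returns one activation per position where the filter fits entirely
    inside the sequence ("valid" padding).
    """
    width = len(kernel)
    out = []
    for start in range(len(x) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(len(ALPHABET)):
                total += x[start + k][c] * kernel[k][c]
        out.append(total)
    return out


# Toy filter whose weights are the one-hot encoding of the motif itself,
# so the activation counts how many bases match "GGCG" at each offset.
motif_filter = one_hot("GGCG")
scores = conv1d_valid(one_hot("ATGGCGTA"), motif_filter)
best_offset = scores.index(max(scores))  # offset where the motif matches best
```

A trained network learns real-valued filter weights from data instead of using a fixed motif, and stacks several such layers so that later layers can combine simpler patterns.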

Submit RNA sequences for predictions

To submit RNA/DNA sequences for predicting their C-to-U editing sites with Deepred-Mt, use the following link:

Submit sequences

Note 1: To be able to submit, you must be logged in with a Google Account (e.g., Gmail).

Note 2: If difficulties are experienced when submitting sequences, try to use Google Chrome as the web browser.

If you encounter problems when submitting sequences, please report an issue.

Installation

To install Deepred-Mt on your computer, follow the steps below. They assume that Conda is already installed.

First, create and activate a new Conda environment:

conda create -n deepredmt python=3.7
conda activate deepredmt

Next, install Deepred-Mt from the sources:

pip install -U "deepredmt @ git+https://github.com/aedera/deepredmt.git"

Usage

Command line

Once installed, Deepred-Mt can be run on the command line to predict C-to-U editing sites in a FASTA file. For example, given a FASTA file called seqs.fas:

deepredmt seqs.fas

This command extracts cytidines from the FASTA file to make predictions based on their surrounding nucleotides.
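The extraction step described above can be sketched as follows. This is an illustration only, not the code that the deepredmt command runs: `read_fasta` and `extract_windows` are hypothetical helper names, and padding sequence ends with "N" is an assumption about how cytidines near the edges might be handled.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_windows(seq, flank=20, pad="N"):
    """Yield (position, upstream, target, downstream) for every cytidine.

    Positions are 1-based. Ends are padded with `pad` (an assumption) so
    that cytidines close to the sequence boundaries still get full flanks.
    """
    seq = seq.upper()
    padded = pad * flank + seq + pad * flank
    for i, base in enumerate(seq):
        if base == "C":
            center = i + flank
            upstream = padded[center - flank:center]
            downstream = padded[center + 1:center + 1 + flank]
            yield i + 1, upstream, base, downstream


# Usage: list the windows for each record of an in-memory FASTA file.
records = list(read_fasta([">seq1", "AACGT", ">seq2", "CA"]))
windows = {name: list(extract_windows(seq, flank=2)) for name, seq in records}
```

With the default `flank=20`, each yielded tuple describes one 41-nucleotide window of the kind the model scores.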

Demo notebooks

The following notebooks reproduce experiments in the article.

Notebook descriptions:
- Use Deepred-Mt on the command line to predict C-to-U editing sites from a given FASTA file
- Compare the predictive performance of Deepred-Mt and state-of-the-art methods for predicting editing sites
- Train Deepred-Mt from scratch

Data

The experiments reported in the manuscript used three datasets built from these FASTA files, which were extracted from nucleotide sequences encoding mitochondrial proteins from 21 plant species. In these files, 'E' nucleotides mark C-to-U editing sites identified using published RNA-seq data obtained from the European Nucleotide Archive.

Datasets:
- Training data: 41-bp nucleotide windows whose center positions are either unedited (C) or edited (E) cytidines. Each window is labeled according to both the nucleotide in its central position (0 for C, 1 for E) and its editing extent (a value ranging from 0 to 1).
- Task-related sequences: Sequences used for the augmentation strategy proposed in the article. These are 41-bp nucleotide windows whose center positions are thymidines homologous to one of the editing sites in the training data.
- Control data: Nucleotide windows in which a fake editing signal, "GGCG", is placed within the downstream regions of the windows labeled as 1 (edited).

More information on the data format is provided here.
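As an illustration of the labeling scheme described for the training data, the sketch below derives (window, label) pairs from a sequence in which 'E' marks an edited cytidine. It is a hypothetical reconstruction, not the script used to build the published datasets: the function name `labeled_windows` is invented, positions too close to the sequence ends are simply skipped, flanking 'E' characters are kept verbatim, and editing extents are not modeled because they are not encoded in the sequence itself.

```python
def labeled_windows(seq, flank=20):
    """Yield (window, label) for each C or E with full flanks on both sides.

    label is 0 when the central nucleotide is an unedited cytidine (C)
    and 1 when it is an edited cytidine (E).
    """
    seq = seq.upper()
    for i in range(flank, len(seq) - flank):
        if seq[i] in "CE":
            window = seq[i - flank:i + flank + 1]
            yield window, 1 if seq[i] == "E" else 0


# Usage with a short flank so the example stays readable.
pairs = list(labeled_windows("AACGTEAA", flank=2))
```

With `flank=20`, every window has 41 positions, matching the window size stated for the datasets.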

Results

In our experiments, Deepred-Mt was compared to two state-of-the-art methods for predicting editing sites: PREP-Mt and PREPACT. The following figure shows the precision-recall curves obtained from the predictions of each method. Deepred-Mt achieves the highest F1 scores and the largest areas under the precision-recall curve (AUPRC) in both predictive scenarios: one excluding synonymous sites (dashed lines) and the other including them (solid lines).

[Figure: precision-recall curves for Deepred-Mt, PREP-Mt, and PREPACT]

Method       Synonymous sites excluded    Synonymous sites included
             AUPRC    F1                  AUPRC    F1
PREPACT      0.91     0.89                0.79     0.82
PREP-Mt      0.88     0.91                0.76     0.84
Deepred-Mt   0.96     0.92                0.91     0.86
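For readers unfamiliar with the two metrics in the table, the following sketch shows one common way to compute them from binary labels and prediction scores. It is an assumption-laden illustration, not the evaluation code behind the reported numbers: the function names are hypothetical, the 0.5 threshold is arbitrary, AUPRC is approximated by average precision (a step-wise estimate), and tied scores are not treated specially.

```python
def precision_recall_f1(labels, scores, threshold=0.5):
    """Return (precision, recall, F1) when predicting 'edited' for scores >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve."""
    positives = sum(labels)
    if positives == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / positives


# Usage on a toy example: two edited sites (1) and two unedited sites (0).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
precision, recall, f1 = precision_recall_f1(labels, scores)
auprc = average_precision(labels, scores)
```

F1 is the harmonic mean of precision and recall at a single threshold, whereas AUPRC summarizes performance across all thresholds, which is why the table reports both.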

Contributing

Contributions from anyone are welcome. You can start by adding a new entry here.

License

Deepred-Mt is licensed under the MIT license. See LICENSE for more details.

deepredmt's Issues

Issue

This step may take up to 5 minutes.

WARNING: Skipping yellowbrick as it is not installed.
WARNING: Skipping xarray-einstats as it is not installed.
WARNING: Skipping imbalanced-learn as it is not installed.
WARNING: Skipping datascience as it is not installed.
WARNING: Skipping albumentations as it is not installed.
WARNING: Skipping arviz as it is not installed.
WARNING: Skipping pymc3 as it is not installed.
Running command git clone --filter=blob:none --quiet https://github.com/aedera/deepredmt.git /tmp/pip-install-8

keras version issue

Note: The higher a score is, the more likely a site is edited.

Traceback (most recent call last):
  File "/usr/local/bin/deepredmt", line 8, in <module>
    sys.exit(deepredmt())
  File "/usr/local/lib/python3.10/dist-packages/deepredmt/cli.py", line 34, in deepredmt
    _predict_from_fasta(fin)
  File "/usr/local/lib/python3.10/dist-packages/deepredmt/cli.py", line 13, in _predict_from_fasta
    wins, preds = predict_from_fasta(fasin)
  File "/usr/local/lib/python3.10/dist-packages/deepredmt/predict.py", line 83, in predict_from_fasta
    model = tf.keras.models.load_model(tf_model, compile='False')
  File "/usr/local/lib/python3.10/dist-packages/keras/src/saving/saving_api.py", line 199, in load_model
    raise ValueError(
ValueError: File format not supported: filepath=/usr/local/lib/python3.10/dist-packages/deepredmt/./model/210520.tf. Keras 3 only supports V3 .keras files and legacy H5 format files (.h5 extension). Note that the legacy SavedModel format is not supported by load_model() in Keras 3. In order to reload a TensorFlow SavedModel as an inference-only layer in Keras 3, use keras.layers.TFSMLayer(/usr/local/lib/python3.10/dist-packages/deepredmt/./model/210520.tf, call_endpoint='serving_default') (note that your call_endpoint might have a different name).
Predictions finished!
Below the first 10 predictions are shown.
Pos Upstream Target Downstream Score

Request for alternative local installation method

I'm experiencing difficulties installing deepredmt using pip install due to network limitations. Could you please provide an alternative local installation method that doesn't rely on direct network access to GitHub?
