garimasingh128 / ovarian-cancer-subtypes-identification

🧠Deep Learning-based✔️ Ovarian Cancer🙅🏻‍♀️ Subtypes Identification using Multi-Omics Data✅

Home Page: https://biodatamining.biomedcentral.com/articles/10.1186/s13040-020-00222-x

License: Apache License 2.0

Jupyter Notebook 100.00%
biodata biodatamining ovarian-cancer deep-learning multi-omics

ovarian-cancer-subtypes-identification's Introduction

🚀Ovarian-Cancer-Subtypes-Identification💻

🚀Contributor Details:💻

  • Garima Singh (1806143)
  • Mrinal (1806149)

🚀Paper Details💻

Published:✨

24 August 2020

Authors:✨

  • Long-Yi Guo
  • Ai-Hua Wu
  • Yong-xia Wang
  • Li-ping Zhang
  • Hua Chai
  • Xue-Fang Liang

Institutions:✨

  • Second School of Clinical Medicine, Guangzhou University of Chinese Medicine, Guangzhou, 510020, China
  • Center for Reproductive Medicine, Guangdong Hospital of Traditional Chinese Medicine, Guangzhou, 510120, China
PMID: 32863885
PMCID: PMC7447574
DOI: 10.1186/s13040-020-00222-x

Model implemented by us👣

Our model combines a denoising autoencoder (DAE), k-means clustering, and a logistic regression classifier. We followed the paper's specifications and techniques as closely as possible, but changed certain values to improve the accuracy score. The patients' multi-omics data are fed into the DAE to generate the representation z. The patients are then clustered with k-means on z. The optimal number of clusters was determined using the silhouette score: we tested k from 2 to 8 and chose k = 2 because it had the highest silhouette score. After obtaining the cluster labels from k-means, we built a lightweight mRNA model with logistic regression to reduce the number of genes needed to identify cancer subtypes.
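The DAE step described above can be sketched as follows. This is only a minimal NumPy illustration, not the notebook's actual network: the single tanh hidden layer, the Gaussian input noise, the hyperparameters, and the names `train_dae` and `encode` are all assumptions, and the input is synthetic data standing in for the multi-omics matrix. It also assumes z is the hidden-layer output.

```python
import numpy as np


def train_dae(x, hidden_dim=8, noise_std=0.1, lr=0.05, epochs=300, seed=0):
    """Train a one-hidden-layer denoising autoencoder with gradient descent.

    Hypothetical helper: returns the weights and the loss recorded per epoch.
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    w_enc = rng.normal(0.0, 0.1, size=(n_features, hidden_dim))
    b_enc = np.zeros(hidden_dim)
    w_dec = rng.normal(0.0, 0.1, size=(hidden_dim, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Corrupt the input; the reconstruction target stays the clean input.
        x_noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        z = np.tanh(x_noisy @ w_enc + b_enc)   # hidden representation z
        x_hat = z @ w_dec + b_dec              # linear reconstruction
        err = x_hat - x
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_z = grad_out @ w_dec.T
        grad_pre = grad_z * (1.0 - z ** 2)     # derivative of tanh
        w_dec -= lr * (z.T @ grad_out)
        b_dec -= lr * grad_out.sum(axis=0)
        w_enc -= lr * (x_noisy.T @ grad_pre)
        b_enc -= lr * grad_pre.sum(axis=0)
    return (w_enc, b_enc, w_dec, b_dec), losses


def encode(x, params):
    """Map clean inputs to the hidden representation z."""
    w_enc, b_enc, _, _ = params
    return np.tanh(x @ w_enc + b_enc)


# Synthetic stand-in for a standardized patients x features matrix.
rng = np.random.default_rng(1)
factors = rng.normal(size=(60, 3))
mixing = rng.normal(size=(3, 20))
x = factors @ mixing + 0.1 * rng.normal(size=(60, 20))
x = (x - x.mean(axis=0)) / x.std(axis=0)

params, losses = train_dae(x)
z = encode(x, params)
print(z.shape, losses[0], losses[-1])
```

In practice a deep learning framework would replace the hand-written gradients; the point here is only that the network sees noisy input, is scored against the clean input, and z is read from the hidden layer.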

Code walk-through👣

This is the Python code for the denoising autoencoder (DAE)🚀 and for k-means clustering on the reconstructed features. Silhouette scores and Davies-Bouldin index (DBI) scores were used to evaluate the clustering performance.
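As a rough illustration of the evaluation described above, the sketch below scans k from 2 to 8 with scikit-learn and reports both scores. The helper name `scan_k` and the synthetic two-group data are assumptions made for the example; in the notebook the input would be the features produced by the DAE.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score


def scan_k(features, k_values=range(2, 9), seed=0):
    """Cluster with k-means for each k and return {k: (silhouette, dbi)}."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(features)
        results[k] = (
            silhouette_score(features, labels),      # higher is better
            davies_bouldin_score(features, labels),  # lower is better
        )
    return results


# Synthetic stand-in for DAE features: two well-separated groups.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(-3.0, 1.0, size=(50, 5)),
    rng.normal(3.0, 1.0, size=(50, 5)),
])

results = scan_k(features)
best_k = max(results, key=lambda k: results[k][0])
for k, (sil, dbi) in results.items():
    print(f"k={k}  silhouette={sil:.3f}  DBI={dbi:.3f}")
print("k with the highest silhouette score:", best_k)
```

The two metrics point in opposite directions: the silhouette score should be maximized, while a lower DBI indicates better-separated clusters.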

Sources👣

Conclusion👣

  • We designed a novel deep learning-based framework for ovarian cancer subtype identification and used logistic regression to build a lightweight classification model.
  • Compared with identifying subtypes from single-omics data, multi-omics analysis can use more information. We therefore proposed a model that helps identify ovarian cancer subtypes robustly.
  • Ovarian cancer ranks 5th in cancer deaths among women and has a high mortality rate, and the risk of developing it is also quite high. Identifying the molecular subtypes of ovarian cancer is therefore important.
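The logistic regression classifier from the first point can be sketched as below. This is a hedged example only: the L1 penalty is an assumption chosen because it drives most gene coefficients to zero (the README does not state which penalty was used), and the expression matrix and labels are synthetic placeholders for the mRNA data and the k-means cluster labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic placeholders: 200 patients, 50 genes, two subtype labels.
rng = np.random.default_rng(0)
n_patients, n_genes = 200, 50
labels = rng.integers(0, 2, size=n_patients)
mrna = rng.normal(size=(n_patients, n_genes))
mrna[:, :5] += 2.0 * labels[:, None]  # only the first 5 genes carry signal

x_train, x_test, y_train, y_test = train_test_split(
    mrna, labels, test_size=0.25, random_state=0, stratify=labels
)

# Assumption: an L1 penalty keeps the model sparse, so few genes are needed.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(x_train, y_train)

selected_genes = np.flatnonzero(clf.coef_[0])  # genes with non-zero weight
accuracy = clf.score(x_test, y_test)
print("genes kept:", selected_genes.size, "of", n_genes)
print("held-out accuracy:", round(accuracy, 3))
```

Lowering `C` strengthens the penalty and keeps fewer genes, at some cost in accuracy; the right trade-off depends on the real data.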

🚀 It is important to understand the heterogeneity of ovarian cancer across patients in order to choose treatment programs and predict clinical outcomes. In this study, we proposed a novel deep learning framework that integrates multi-omics data with denoising autoencoders to identify ovarian cancer subtypes. Two molecular subtypes were identified in ovarian cancer, and the results show that the proposed method is competitive and reliable. The method comparison indicated that our method outperformed the traditional and deep learning-based methods. More importantly, the classification model was validated on three independent test datasets collected from GEO. All p-values were less than 0.05, showing that the differences between the classified cancer subgroups are significant.

By combining the results of the DEG and WGCNA analyses, we selected 34 target genes related to ovarian cancer. Using these 34 genes, 19 KEGG pathways were enriched, including the PI3K-Akt signaling pathway and the human papillomavirus infection pathway. The literature review shows that 19 (56%) biomarkers and 8 (42.1%) KEGG pathways identified from the classified subtypes have been shown to be associated with ovarian cancer.

References👣

Availability of data and materials👣

All the data analyzed during the current study are available in the TCGA and GEO datasets.

🔆 Tech Stack

The project is built with Python in JupyterLab.

🚀 Steps to setup development environment

  1. Clone the repo
git clone https://github.com/your_username/Ovarian-Cancer-Subtypes-Identification.git
  2. Open the folder in your favorite code editor and start making modifications.

💻 Development guidelines

  1. Put your code in the appropriate folders.

  2. Push all the code to your own branch. Once you are sure it is working, merge it with the dev branch. Let's maintain only the stable and released versions on the master branch.

  3. Write kick-ass, readable, and clean code.

📝 Learning Resources

Read these articles to get a quick grasp of Git and GitHub:

Feel free to create issues to suggest and add functionalities and features.

💻 System Requirements

  • Google Chrome
  • Git
  • Code Editor (Jupyter preferably)
  • Python

🏆 Contributing

Please read CONTRIBUTING.md for information on how to contribute to Ovarian-Cancer-Subtypes-Identification.

💼 Code of Conduct

We want to foster healthy and constructive community behavior by adopting and enforcing our code of conduct.

Please adhere to our code-of-conduct.md.

👬 Owner


Garima Singh


Mrinal Kumar

built with love

❤️ Thanks to our awesome contributors.
