emdann / sc_target_evidence

Meta-analysis of drug target evidence in single-cell data

License: BSD 3-Clause "New" or "Revised" License

Jupyter Notebook 99.67% Python 0.32% Shell 0.01%

sc_target_evidence's People

Contributors

emdann, erteeple

sc_target_evidence's Issues

Targeted analysis of lung diseases

  • Compare 2 diseases with and without genetic evidence (e.g. COPD and pulmonary fibrosis) - which successful targets are prioritized by single-cell evidence?
  • Why is there no genetic evidence for some of these diseases? Is it lack of GWAS studies or no significant hits?
  • Are targets with sc evidence associated with GWAS variants for lung function?

Checking the cell annotations

Check output plots for each disease in /nfs/team205/ed6/bin/sc_target_evidence/data/plots/ (or here). What each figure shows is briefly explained in DATA_INFO.

  • Check whether we need to blacklist more terms - e.g. lymphocyte
  • Check whether we need to exclude certain diseases
  • Check expression of correct marker genes - the full list of cell ontology terms used across diseases is here. See DATA_INFO for location of .h5ad files.
  • Flag anything else that looks odd

Plot diagnostics for DE analysis

  • Volcano plots
  • ECDF of significant cell types x lfcThreshold
  • Expression of genes that are significant in a large number of cell types for the cell type specificity evidence
  • Number of significant cell types
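The last three diagnostics all depend on counting, per gene, how many cell types pass the significance cutoffs at a given lfcThreshold. A minimal sketch of that count and its ECDF follows. The tuple layout, the `alpha` cutoff and the toy values are assumptions for illustration, not the repository's actual DE output format.

```python
from bisect import bisect_right

# Toy DE results: (gene, cell_type, log fold change, adjusted p-value).
# All names and values are made up for illustration.
de_results = [
    ("GENE_A", "T cell", 2.5, 0.001),
    ("GENE_A", "B cell", 1.2, 0.01),
    ("GENE_A", "fibroblast", 0.4, 0.03),
    ("GENE_B", "T cell", 0.8, 0.2),
    ("GENE_B", "fibroblast", 3.0, 0.0001),
    ("GENE_C", "B cell", 0.1, 0.9),
]


def n_significant_celltypes(results, lfc_threshold, alpha=0.1):
    """Count, per gene, the cell types passing both the lfc and p-value cutoffs."""
    counts = {gene: 0 for gene, *_ in results}
    for gene, _cell_type, lfc, padj in results:
        if lfc > lfc_threshold and padj < alpha:
            counts[gene] += 1
    return counts


def ecdf(values):
    """Return a function F where F(x) is the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


# One ECDF per lfcThreshold: how many genes are significant in at most k cell types?
for threshold in (0.0, 1.0, 2.0):
    counts = n_significant_celltypes(de_results, threshold)
    F = ecdf(counts.values())
    print(threshold, counts, [round(F(k), 2) for k in range(4)])
```

Genes sitting in the upper tail of this distribution are the ones worth inspecting for the cell type specificity evidence, since being significant in many cell types argues against specificity.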

Low quality annotation diseases

A number of diseases are excluded from the DE-in-disease analysis because no disease pseudobulk is left after filtering out low-quality annotations, although in the previous version of the pipeline these diseases still had cell types available for comparison:

To check

MONDO_0005575
MONDO_0006156
MONDO_0006249
MONDO_0024660
MONDO_0024661
MONDO_0024885
MONDO_0001056 # gastric cancer
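One way to build such a list systematically is to compare, per disease, the cell types that survive filtering in the previous and the current pipeline run. The sketch below assumes each run can be summarised as a mapping from disease ID to the set of testable cell types. That summary format, the function name and the placeholder IDs are hypothetical.

```python
def dropped_diseases(previous, current):
    """Diseases that had testable cell types before but have none now."""
    return sorted(
        disease
        for disease, cell_types in previous.items()
        if cell_types and not current.get(disease)
    )


# Placeholder disease IDs and cell types, for illustration only.
previous_run = {
    "DISEASE_1": {"T cell", "B cell"},
    "DISEASE_2": {"fibroblast"},
    "DISEASE_3": set(),
}
current_run = {
    "DISEASE_1": {"T cell"},
    "DISEASE_2": set(),
}

print(dropped_diseases(previous_run, current_run))  # ['DISEASE_2']
```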

API structure

  • cellontology_utils - functions to handle cell ontologies
  • preprocessing_utils - things like anndata2pseudobulk, adding similarity between cell types to the pseudobulk objects, functions to save and read pseudobulk objects?
  • cxg_utils - functions to download data and metadata using cxg census
  • plotting_utils - functions to make diagnostic plots
  • de_utils - functions for DE analysis (cleaning data, running DE analysis with different regimes, saving outputs)
  • sc_evidence - transform DE analysis results to evidence for drug targets
  • opentargets_utils - functions to clean OT datasets

Toxicology/safety evidence

The idea is to compare expression of the target in the relevant tissue with expression across usual suspects for side effects (blood, liver, heart) or surrounding organs.

The problem is defining what a surrounding organ is.
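Leaving the surrounding-organ question aside, the comparison against the usual suspects could be scored as a log ratio between expression in the relevant tissue and the highest expression among the safety tissues. This is only a sketch of one possible score: the function name, the pseudocount and the per-tissue mean expression input are assumptions, not something defined in the repository.

```python
import math

# Tissues named in the issue as usual suspects for side effects.
SAFETY_TISSUES = ("blood", "liver", "heart")


def off_target_margin(mean_expr, target_tissue,
                      safety_tissues=SAFETY_TISSUES, pseudocount=1.0):
    """log2 ratio of target-tissue expression to the highest safety-tissue expression.

    Positive values mean the gene is expressed more in the target tissue
    than in any of the safety tissues.
    """
    target = mean_expr[target_tissue]
    worst = max(
        (mean_expr.get(t, 0.0) for t in safety_tissues if t != target_tissue),
        default=0.0,
    )
    return math.log2((target + pseudocount) / (worst + pseudocount))


# Toy per-tissue mean expression for one gene (made-up values).
example = {"lung": 15.0, "blood": 1.0, "liver": 3.0, "heart": 0.0}
print(off_target_margin(example, "lung"))  # 2.0
```

Taking the maximum over safety tissues is a deliberately conservative choice: a target is penalised if it is highly expressed in any one of them.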

Fine vs coarse cell annotation on lung dataset

Compare targets with single-cell evidence under coarse vs. fine uniform cell annotations using lung samples/diseases, i.e. compare ontology-based coarse annotations with the annotations from the Extended Lung cell atlas.

  • Are the same genes identified as cell type markers and disease cell type markers?
  • Are more successful targets identified when testing on fine grained annotation?
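Both questions reduce to comparing gene sets obtained under the two annotation levels. A minimal sketch using Jaccard overlap and set differences is shown below. The gene sets are toy placeholders, and how the sets are derived from the DE results is left open.

```python
def jaccard(a, b):
    """Jaccard index of two gene collections; 1.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy gene sets with evidence under each annotation level (illustrative only).
coarse_hits = {"GENE_A", "GENE_B", "GENE_C"}
fine_hits = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}

print(jaccard(coarse_hits, fine_hits))   # 0.4
print(sorted(fine_hits - coarse_hits))   # genes found only with fine annotation
print(sorted(coarse_hits - fine_hits))   # genes lost when going to fine annotation
```

Intersecting the fine-only genes with the set of successful targets would then address the second question directly.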

Fix download of problematic stomach / brain files

Brain files fail because of insufficient RAM, even when requesting 700 GB. Download of stomach files from cxg hangs indefinitely; it looks like something is wrong on the census side.

Try downloading the datasets directly from the cellxgene website or sfaira, then filtering and pseudobulking.
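If the downloaded files can be read cell by cell or in chunks, pseudobulking does not need the full matrix in memory, because only the running sums per (donor, cell type) group have to be kept. The sketch below illustrates that accumulation in plain Python. It does not use the anndata or census APIs, and the `(donor, cell_type, counts)` record layout is an assumption.

```python
from collections import defaultdict


def pseudobulk_stream(cells, n_genes):
    """Sum per-cell counts into one vector per (donor, cell type) group.

    `cells` can be any iterable, including a generator that reads from disk,
    so only one cell's counts need to be in memory at a time.
    """
    sums = defaultdict(lambda: [0] * n_genes)
    n_cells = defaultdict(int)
    for donor, cell_type, counts in cells:
        acc = sums[(donor, cell_type)]
        for i, c in enumerate(counts):
            acc[i] += c
        n_cells[(donor, cell_type)] += 1
    return dict(sums), dict(n_cells)


# Toy records: (donor, cell type, raw counts for 3 genes). Values are made up.
toy_cells = [
    ("donor1", "T cell", [1, 0, 2]),
    ("donor1", "T cell", [0, 3, 1]),
    ("donor2", "T cell", [5, 5, 5]),
]

sums, n_cells = pseudobulk_stream(iter(toy_cells), n_genes=3)
print(sums[("donor1", "T cell")], n_cells[("donor1", "T cell")])  # [1, 3, 3] 2
```

Filtering (for example on disease or tissue) can be applied inside the generator that feeds `cells`, before any counts are accumulated.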
