
crazyhottommy / chip-seq-analysis

ChIP-seq analysis notes from Ming Tang

License: MIT License

Python 100.00%
chip-seq histone-modifications transcription-factor-binding

chip-seq-analysis's Introduction

Hi there 👋

  • I have over 12 years of computational biology experience, 6 years of molecular cancer biology and 4 years of immunology experience.
  • I am a computational biologist working on (single-cell) genomics, epigenomics and (spatial) transcriptomics.
  • I use machine learning approaches to find new drug targets for cancer patients.
  • I use Google Cloud and Terra for large-scale data processing.
  • I use R primarily for data wrangling and visualization in the tidyverse ecosystem.
  • I use Python for writing Snakemake workflows and reformatting data.
  • I am a Unix geek learning shell tricks almost every month. I care about reproducible research and open science.

Learn more about me at https://tommytang.bio.link/

Join my FREE newsletter to learn computational biology https://divingintogeneticsandgenomics.ck.page/newsletter

Subscribe to my chatomics youtube channel https://www.youtube.com/@chatomics

You do not need a master's degree to learn computational biology. Grab my book to learn computation here.

chip-seq-analysis's Issues

Difference between Diffbind library size normalisation vs DESeq2 library size normalisation

Dear Tommy,

I am an avid follower of your ChIP-seq tutorials and blogs. I am now looking at part 3 of your tutorial, which covers using DiffBind or DESeq2 for analysis. I have been using DiffBind until now but want to switch to DESeq2 because I want to control for multiple covariates, which the DiffBind package does not currently offer (only one blocking factor at a time). I have the raw counts generated and I can do the full analysis; however, I have a question regarding library size normalisation. DiffBind does it by default. The DESeq2 vignette also suggests that it does library size normalisation. The difference I find is that DiffBind takes the library size information from the BAM files and uses that, which is probably the total mapped reads. DESeq2, since it doesn't have the BAMs, probably takes the column-wise sum of read counts per sample to get the library size. Fundamentally, will they be different or not?
My main point is: can I trust the DESeq2 library size normalisation method as opposed to the DiffBind way of library size normalisation?
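
One point relevant to the question: DESeq2's default size factors come from the median-of-ratios method applied to the count matrix, not from plain column sums, so they can differ from total-read scaling when a few peaks change strongly between samples. The sketch below illustrates that difference in plain Python with made-up counts. The function names are hypothetical, column sums stand in for the BAM-derived library sizes, and this is not the DiffBind or DESeq2 code itself.

```python
import math
from statistics import median

def total_count_factors(lib_sizes):
    """Scale factors from total library size, expressed relative to the mean size."""
    mean_size = sum(lib_sizes) / len(lib_sizes)
    return [size / mean_size for size in lib_sizes]

def median_of_ratios_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per peak), each a list of per-sample counts.
    Peaks with a zero in any sample are skipped, since their geometric
    mean is zero and the ratio is undefined.
    """
    n_samples = len(counts[0])
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    # log geometric mean of each peak across samples
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Made-up example: sample 2 is sequenced twice as deep as sample 1,
# and the last peak is strongly enriched in sample 2 only.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [5, 500],
]
column_sums = [sum(col) for col in zip(*counts)]   # hypothetical stand-in for library sizes
tc = total_count_factors(column_sums)
mr = median_of_ratios_factors(counts)
print("total-count ratio (sample2/sample1):", tc[1] / tc[0])
print("median-of-ratios ratio (sample2/sample1):", mr[1] / mr[0])
```

In this toy matrix the single strongly enriched peak inflates the total-count ratio well above 2, while the median-of-ratios estimate stays at the twofold depth difference shared by most peaks. Which behaviour is preferable depends on whether a global shift in binding is expected in the experiment.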

Can ROSE handle multiple samples together?

Hi @crazyhottommy,
I am trying to use ROSE to call super-enhancers in two subtypes of a cancer, where each group has 5 samples. I am not sure how to identify potential super-enhancers when handling multiple samples from each subtype together in ROSE, or how to compare them with the other subtype.
Do you have any suggestions? I would really appreciate your help.
Thanks

Missing summit location when run macs2 with --broad option

Hi Tommy,

Thanks for your post about MACS, it helped me a lot!

I usually use HOMER to call histone modification peaks. Recently, I needed to use MACS to "validate" my previous results. When I run

macs2 callpeak -t test_rep1.bam -n test --broad -p 0.01 --nomodel --extsize 147 -g 1.4e9

the output files don't include NAME_summits.bed, and there is no "absolute peak summit position" column in NAME_peaks.xls. Did I miss some specific options?

Or do you happen to know how I can quickly identify the summit location (the position with the most enriched signal) for each region in a BED file? I am new to bioinformatics. I wrote a small script that calculates FPKM in 100 bp bins and then picks the maximum value, but it is very slow (of course).

Thanks for your help in advance!
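
One possible way to speed up the summit search described above is to scan a precomputed coverage track rather than recounting reads per bin. The following is a minimal sketch that assumes the signal is already available as bedGraph-style intervals for a single chromosome; the function name `find_summit` and the example values are hypothetical.

```python
def find_summit(region_start, region_end, bedgraph):
    """Return (position, value) of the highest-signal point inside a region.

    bedgraph: iterable of (start, end, value) half-open intervals on the
    same chromosome as the region. When the maximum is a plateau, the
    midpoint of the overlapping stretch is reported; on ties, the first
    plateau encountered wins. Returns (None, None) if nothing overlaps.
    """
    best_pos, best_value = None, None
    for start, end, value in bedgraph:
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo >= hi:
            continue  # interval does not overlap the region
        if best_value is None or value > best_value:
            best_value = value
            best_pos = (lo + hi - 1) // 2
    return best_pos, best_value

# Hypothetical coverage intervals and one broad region
coverage = [(100, 150, 2.0), (150, 180, 7.5), (180, 300, 3.0)]
summit = find_summit(120, 250, coverage)
print(summit)  # (164, 7.5)
```

For many regions, sorting the intervals once and walking both lists together, or using an interval index, would avoid rescanning the whole track for every region.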

NCIS scaling script

I was trying to use the NCIS_scaling.R script.
It does not work on my BAM files.
The problem occurs at this line:

ga_ChIP<- readAligned(bamFile, type="BAM")
arugment 'type' had value 'BAM' allowable values: 'SolexaExport'
'SolexaAlign' 'SolexaPrealign' 'SolexaRealign' 'SolexaResult'
'MAQMap' 'MAQMapShort' 'MAQMapview' 'Bowtie' 'SOAP'

Even when I pass 'Bowtie' as the type argument, it still does not work.

How ROSE work?

ROSE uses 12.5 kb to identify super-enhancers. Does that mean each enhancer is resized to 12.5 kb?
Thanks!
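
As commonly described, the 12.5 kb in ROSE is a stitching distance: constituent enhancers that lie within 12.5 kb of each other are merged into one stitched region, and individual enhancers are not resized. The sketch below illustrates only that merging step under this assumption. The function name `stitch` and the coordinates are hypothetical, and ROSE's other steps (such as optional TSS exclusion and signal ranking) are not shown.

```python
def stitch(enhancers, max_gap=12_500):
    """Merge (start, end) intervals on one chromosome whose gap is <= max_gap."""
    stitched = []
    for start, end in sorted(enhancers):
        if stitched and start - stitched[-1][1] <= max_gap:
            # close enough to the previous region: extend it
            stitched[-1][1] = max(stitched[-1][1], end)
        else:
            stitched.append([start, end])
    return [tuple(region) for region in stitched]

# Hypothetical enhancer calls on a single chromosome
enhancers = [(1_000, 2_000), (10_000, 11_000), (30_000, 31_000)]
regions = stitch(enhancers)
print(regions)  # [(1000, 11000), (30000, 31000)]
```

Here the first two enhancers are 8 kb apart and become one stitched region, while the third is 19 kb away from that region and stays separate.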

Gene set enrichment on IDR peaks

Hello,

I ran ENCODE's ChIP-seq pipeline in "tf" mode. I want to do some gene set enrichment analysis on the IDR reproducible peaks. I have tried the tools you mentioned, such as chipenrich, but I do not get any enriched terms. Do you know of any tool that could help me?

Thank you,
Asma
