crazyhottommy / chip-seq-analysis



ChIP-seq analysis notes from Ming Tang

License: MIT License

Python 100.00%

chip-seq histone-modifications transcription-factor-binding

chip-seq-analysis's Introduction

Hi there 👋

I have over 12 years of computational biology experience, 6 years of molecular cancer biology experience, and 4 years of immunology experience.
I am a computational biologist working on (single-cell) genomics, epigenomics, and (spatial) transcriptomics.
I use machine learning approaches to find new drug targets for cancer patients;
I use Google Cloud and Terra for large-scale data processing;
I use R primarily for data wrangling and visualization in the tidyverse ecosystem;
I use Python for writing Snakemake workflows and reformatting data;
I am a Unix geek learning shell tricks almost every month; I care about reproducible research and open science.

Learn more about me at https://tommytang.bio.link/

Join my FREE newsletter to learn computational biology https://divingintogeneticsandgenomics.ck.page/newsletter

Subscribe to my chatomics youtube channel https://www.youtube.com/@chatomics

You do not need a master's degree to learn computational biology. Grab my book to learn computation here


chip-seq-analysis's Issues

SICER-rewrite

Dunno if you are interested, but if yes, you can find it at https://github.com/endrebak/epic

the blog link not working

Hello crazyhottommy,
I want to visit your blog, but the links are not working now. Did you change them?
Links : http://crazyhottommy.blogspot.jp/search/label/CpG
http://crazyhottommy.blogspot.jp/search/label/MeDIP-seq

How to average genome-wide data link not working

This is probably what you meant to link to: https://liorpachter.wordpress.com/2015/07/13/how-to-average-genome-wide-data/ ?

How to add gene names on the SE plot from ROSE?

Hi @crazyhottommy,
Could you please advise how I can add gene names to the inflection plot from ROSE for super-enhancers?
I'd appreciate your help.
Thanks

Difference between DiffBind library size normalisation and DESeq2 library size normalisation

Dear Tommy,

I am an avid follower of your ChIP-seq tutorials and blog posts. I am now looking at part 3 of your tutorial, which covers using DiffBind or DESeq2 for analysis. I have been using DiffBind until now but want to switch to DESeq2 because I want to control for multiple covariates, which DiffBind does not currently offer (only one blocking factor at a time). I have generated the raw counts and can do the full analysis, but I have a question about library size normalisation. DiffBind does it by default, and the DESeq2 vignette suggests that DESeq2 does too. The difference I see is that DiffBind takes the library size from the BAM files, which is probably the total number of mapped reads. DESeq2 does not have the BAMs, so it probably sums the read counts column-wise for each sample to get the library size. Fundamentally, will the two differ?
My main point is: can I trust the DESeq2 library size normalisation as opposed to the DiffBind way of library size normalisation?
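
To make the comparison in this question concrete, here is a minimal sketch in Python (not the actual code of either package). It contrasts scaling by full library size, as described above for DiffBind, with the median-of-ratios size factors that DESeq2's estimateSizeFactors computes by default, which are not plain column sums. The function names and the toy numbers are assumptions made up for illustration.

```python
import numpy as np

def total_count_factors(lib_sizes):
    """Scale factors from full library sizes (e.g. total mapped reads per BAM)."""
    lib_sizes = np.asarray(lib_sizes, dtype=float)
    return lib_sizes / lib_sizes.mean()

def median_of_ratios_factors(counts):
    """Median-of-ratios size factors from a peaks x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Only peaks with a non-zero count in every sample contribute
    keep = (counts > 0).all(axis=1)
    log_counts = np.log(counts[keep])
    # Per-peak geometric mean across samples, on the log scale
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample median of the ratios to that geometric mean
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Toy numbers, invented for illustration only
counts = np.array([[10, 20], [20, 40], [30, 60], [40, 80]])
bam_lib_sizes = [1_000_000, 1_500_000]

full = total_count_factors(bam_lib_sizes)
mor = median_of_ratios_factors(counts)
print("full library size, sample2/sample1:", full[1] / full[0])
print("median of ratios,  sample2/sample1:", mor[1] / mor[0])
```

In this toy case the two approaches disagree: the reads in peaks double in the second sample while the full library is only 1.5 times larger. Which behaviour is appropriate depends on whether a global change in binding is expected, so the two methods can differ fundamentally when the fraction of reads in peaks varies between samples.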

Can ROSE handle multiple samples together?

Hi @crazyhottommy,
I am trying to use ROSE to call super-enhancers in two subtypes of a cancer, where each group has 5 samples. I am not sure how to identify potential SEs when handling multiple samples from each subtype together with ROSE, or how to compare them with the other subtype.
Do you have any suggestions? I'd really appreciate your help.
Thanks

Missing summit location when running macs2 with the --broad option

Hi Tommy,

Thanks for your post about MACS, it helped me a lot!

I usually use Homer to call histone modification peaks. Recently, I needed to use MACS to "validate" my previous results. When I run

macs2 callpeak -t test_rep1.bam -n test --broad -p 0.01 --nomodel --extsize 147 -g 1.4e9

The output files don't have NAME_summits.bed and there is no column of "absolute peak summit position" in NAME_peaks.xls. Did I miss some specific options?

Or do you happen to know how I can quickly identify the summit location (the most enriched signal location in a region) for the regions in a BED file? I am new to bioinformatics. I wrote a small script that calculates the FPKM of each 100 bp bin and then picks the maximum value, but it is very slow (of course).

Thanks for your help in advance!
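
One way to speed up the bin-scanning idea from this question is to compute all window sums at once with a cumulative sum. The sketch below is only an illustration: it assumes the per-base signal for a region has already been loaded into an array (for example from a bigWig or bedGraph), it slides a window at base resolution instead of using fixed 100 bp bins, and the function name `find_summit` and the toy coverage are hypothetical.

```python
import numpy as np

def find_summit(coverage, region_start, window=100):
    """Return (start, end, mean_signal) of the highest-signal window in a region.

    coverage: per-base signal for the region (assumed to be loaded already).
    region_start: genomic coordinate of coverage[0].
    """
    coverage = np.asarray(coverage, dtype=float)
    if coverage.size == 0:
        raise ValueError("empty region")
    window = min(window, coverage.size)
    # A cumulative sum gives every window sum with one subtraction each
    csum = np.concatenate(([0.0], np.cumsum(coverage)))
    window_sums = csum[window:] - csum[:-window]
    best = int(np.argmax(window_sums))  # first window with the maximal sum
    start = region_start + best
    return start, start + window, window_sums[best] / window

# Toy region: flat background with a 100 bp enriched block
toy_coverage = np.zeros(1000)
toy_coverage[400:500] = 5.0
summit = find_summit(toy_coverage, region_start=10_000)
print(summit)
```

The whole region is processed in a few vectorised operations, so looping over the regions of a BED file should stay fast even for many broad peaks.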

NCIS scaling script

I was trying to use the NCIS_scaling.R script, but it does not work on my BAM files.
The problem occurs on this line:

ga_ChIP<- readAligned(bamFile, type="BAM")
arugment 'type' had value 'BAM' allowable values: 'SolexaExport'
'SolexaAlign' 'SolexaPrealign' 'SolexaRealign' 'SolexaResult'
'MAQMap' 'MAQMapShort' 'MAQMapview' 'Bowtie' 'SOAP'

Even when I pass "Bowtie" as the type argument, it still does not work.

epic2 out in Bioinformatics

https://github.com/biocore-ntnu/epic2

https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btz232/5421513

AFAIK it is by far the fastest ChIP-Seq caller :) It also fixes a bug in SICER.

Btw, you might be interested in this: https://github.com/biocore-ntnu/pyranges

Since you are using Python sometimes with Snakemake it might be more convenient than R GenomicRanges (which also is very nice :) )

Would love to see one of these for ATAC-Seq!

How does ROSE work?

ROSE uses 12.5 kb to identify super-enhancers. Does that mean each enhancer is changed to 12.5 kb?
Thanks!
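
As far as I understand ROSE, 12.5 kb is its default stitching distance: constituent enhancer peaks that lie within 12.5 kb of each other are merged into one stitched region, so individual enhancers are not resized to 12.5 kb. A minimal sketch of that stitching idea in Python, not ROSE's own code, is below. The function name `stitch_enhancers` and the toy peaks are hypothetical, and steps such as promoter (TSS) exclusion and signal ranking are left out.

```python
def stitch_enhancers(peaks, max_gap=12500):
    """Merge peaks on the same chromosome separated by at most max_gap bp.

    peaks: iterable of (chrom, start, end) tuples.
    """
    stitched = []
    for chrom, start, end in sorted(peaks):
        prev = stitched[-1] if stitched else None
        if prev and prev[0] == chrom and start - prev[2] <= max_gap:
            # Close enough to the previous region: extend it
            stitched[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            stitched.append((chrom, start, end))
    return stitched

# Toy peaks, invented for illustration only
toy_peaks = [
    ("chr1", 1000, 2000),
    ("chr1", 10000, 11000),   # 8 kb from the previous peak: stitched
    ("chr1", 40000, 41000),   # 29 kb away: kept separate
    ("chr2", 500, 900),
]
stitched = stitch_enhancers(toy_peaks)
print(stitched)
```

The stitched regions, not the original peaks, are what would then be ranked by signal to separate super-enhancers from typical enhancers.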

Gene set enrichment on IDR peaks

Hello,

I ran ENCODE's ChIPSeq pipeline in "tf" mode. I want to do some gene set enrichment analysis on the IDR reproducible peaks. I have tried to work with the tools you mentioned such as chipenrich, however I do not get any terms that are enriched. Do you know of any tool that could help me?

Thank you,
Asma
