GithubHelp home page GithubHelp logo

Comments (13)

timothy-barry avatar timothy-barry commented on August 19, 2024

Hi Yan,

Do you mind telling me a bit more about the analysis you are trying to run? For example, are you seeking to deploy the sceptre pipeline to analyze the Gasperini 2019 data?

The data in the Dropbox repo (which are associated with the "Robust DE testing..." preprint) are in ondisc version 1.1.0 format. You can download ondisc version 1.1.0 via install_github("timothy-barry/ondisc", ref = "v1.1"). This should allow you to access the Gasperini data stored in the Dropbox repo.

However, note that ondisc version 1.1.0 is not compatible with the sceptre Nextflow pipeline. Instead, the Nextflow pipeline requires ondisc version 1.2.0 (i.e., the current version of ondisc). I can supply you with the Gasperini data stored in ondisc version 1.2.0 if that would be helpful.

Tim

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

Hi, Tim

Thank you for the prompt reply.

Definitely, I am intersted in some enhancers in K562 cell line and seeking to deploy the sceptre pipeline to analyze the Gasperini 2019 data.
The Gasperini data stored in ondisc version 1.2.0 you said are the input file mentioned in import-gasperini-2019-v3 repos ?

└── at-scale
    ├── processed
    │   ├── grna.odm
    │   ├── response.odm
    │   ├── sceptre_object.rds
    │   └── sceptre_object_in_memory.rds

If yes, Gasperini data stored in ondisc version 1.2.0 are exactly what I need. You will share me a dropbox link, right ? I am so grateful for your help !!!

Yan

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

Hi Yan,

Got it. We were planning to do a trans analysis of the Gasperini data as part of our current project. I think I will get to this in about a week or two, probably. Perhaps I could send you the data and preliminary results at that time?

Tim

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

Hi Yan,

Got it. We were planning to do a trans analysis of the Gasperini data as part of our current project. I think I will get to this in about a week or two, probably. Perhaps I could send you the data and preliminary results at that time?

Tim

Thank you Tim. I am looking forward to your results.

Yan

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

Hi Tim,
I followed the import-gasperini-2019-v3 pipeline to generate these files:

(sceptre) [liangyan@mu01 at-scale]$ tree processed/ 1.sceptre_res/
processed/
├── grna.odm
├── pairs_grouped.rds
├── response.odm
├── sceptre_object_mem.rds
└── sceptre_object.rds
1.sceptre_res/
├── 1.runSCEPTRE_pipeline.sh
├── 1.sceptre_outputs
│   ├── analysis_summary.txt
│   ├── grna_assignment_matrix.rds
│   ├── plot_assign_grnas.png
│   ├── plot_grna_count_distributions.png
│   ├── plot_run_calibration_check.png
│   ├── plot_run_discovery_analysis.png
│   ├── plot_run_power_check.png
│   ├── plot_run_qc.png
│   ├── results_run_calibration_check.rds
│   ├── results_run_discovery_analysis.rds
│   └── results_run_power_check.rds
├── 1.sceptre.R
├── 1.sgRNA_DEanalysis.csv
├── 1.sgRNA_info.csv
├── 302669.err
└── 302669.out

Interestingly, I found that only on chromosome 9, there are some siginicant enhancer-gene pairs which is whose distance is much greater than 500kb. For example, the UBAC1 gene located far away form the two enhancer. It seems strange, isn't it?
image

However, the GSE120861_gene_gRNAgroup_pair_table.at_scale.txt do not inclue the enhancer-gene pairs.

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

Hi Lyon,

Could you provide some details about the analysis that you ran? For example, did you run a cis analysis or trans analysis (I'm guessing the former)? Were you using the ondisc-backed sceptre_object or the standard sceptre_object? And did you use the R console interface or the Nextflow pipeline?

I am not sure about the enhancer located at ~chr9:135879214, but my suspicion is that this enhancer regulates a nearby transcription factor, which in turn regulates UBAC1.

By the way, I plan to run a trans analysis on the Gasperini data today. Perhaps we could discuss more when I obtain those results.

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

I used the ondisc-backed sceptre_object just in the R console interface. Actually, I run an cis-analysis, thus I do not think for enhancer located at ~chr9:135879214, the UBAC1 gene should be tested.

By the way, the gene-sgRNA pairs table GSE120861_gene_gRNAgroup_pair_table.at_scale.txt do not include the ~chr9:135879214 enhancer-UBAC1 pairs.

This is the the code I run, following get started with sceptre.

library(sceptre)
library(dplyr)
library(conflicted)
conflict_prefer("filter", "dplyr")

gasp_offsite <- paste0("/public/home/liunangroup/liangyan/project/tf-assymmetry/20240415_CreType-diff/CRISPR_screen/import-gasperini-2019-v3/",
                       "at-scale/")
processed_data_dir <- paste0(gasp_offsite, "processed/")

sceptre_object <- read_ondisc_backed_sceptre_object(
  sceptre_object_fp = paste0(processed_data_dir, "sceptre_object.rds"),
  response_odm_file_fp = paste0(processed_data_dir, "response.odm"),
  grna_odm_file_fp = paste0(processed_data_dir, "grna.odm"))
sceptre_object

write.csv(sceptre_object@grna_target_data_frame,'1.sgRNA_info.csv',row.names=F,col.names=T)

positive_control_pairs <- construct_positive_control_pairs(sceptre_object)
discovery_pairs <- construct_cis_pairs(
  sceptre_object, 
  positive_control_pairs = positive_control_pairs)

# apply the pipeline functions to the sceptre_object in order
sceptre_object <- sceptre_object |> # |> is R's base pipe, similar to %>%
  set_analysis_parameters(discovery_pairs, positive_control_pairs,resampling_mechanism="permutations") |>
  run_calibration_check(parallel = TRUE) |>
  run_power_check(parallel = TRUE) |>
  run_discovery_analysis(parallel = TRUE)

# write the results to disk
write_outputs_to_directory(sceptre_object, "1.sceptre_outputs")
thread='32'
sbatch --job-name=job --partition=cu --nodes=1 --ntasks-per-node=$thread --output=%j.out --error=%j.err --wrap="
/public/home/liunangroup/liangyan/software/miniconda3/envs/sceptre/bin/Rscript 1.sceptre.R
"

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

I guess I find the reason.
sceptre using coordinate of gene position in hg38 reference genome. However, the results of gasperini-2019 based on hg19 genome.

Actually, the TSS of UBAC1 in hg38 is Chromosome 9: 135,932,969-135,961,373, locating at the cis-region of ~chr9:135879214

?construct_cis_pairs

#Usage:
#
#     construct_cis_pairs(
#       sceptre_object,
#       positive_control_pairs = data.frame(),
#       distance_threshold = 500000L,
#       response_position_data_frame = gene_position_data_frame_grch38)

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

Ah, I see. The sceptre high-MOI example uses gene_position_data_frame_grch38 in construct_cis_pairs(), but really it should be using gene_position_data_frame_grch19. 😳 I will add gene_position_data_frame_grch19 to the package and fix the example. Thank you for noticing this!

By the way, I ran a trans analysis of the Gasperini data using the sceptre pipeline. Note that the trans analysis pairs all gRNAs to all genes; thus, the issue with the construct_cis_pairs() function is not relevant here. You can download the results here: https://www.dropbox.com/scl/fo/q2sasfxtxzprsfazyjbdm/AOjk2GDhhyjzdZcdpk2QuTs?rlkey=2drsij7e1l2wytaaqfxg0ro6h&st=5jvgi08f&dl=0. Note that these results are preliminary, and we will officially release these results as part of our upcoming manuscript. The book provides instructions for interacting with the .parquet files.

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

Oh, I should add that I randomly grouped the NT gRNAs into groups of size two for this analysis. I then tested each NT gRNA group against the entire set of genes. The NT gRNA group names begin with "nt_".

from ondisc.

L1angyan avatar L1angyan commented on August 19, 2024

Hi Tim,
Thank you for your help. Looking forward to your new work.

By the way, I guess the purpose of grouping NT sgRNAs is acquiring some baseline effect of negtive control for hypothesis testing, so I do not need to pay attention to the groups begin with "nt_", right?

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

Yes, that is right, you can ignore the gRNA targets that begin with "nt_". (However, if you want to convince yourself that sceptre is doing the right thing on these data --- i.e., controlling type-I error --- then you may want to look at the "nt_" gRNA targets).

from ondisc.

timothy-barry avatar timothy-barry commented on August 19, 2024

(Note that sceptre does not use the NT gRNAs to test hypotheses or compute p-values; rather, sceptre uses the NT gRNAs to verify control of type-I error/FDR.)

from ondisc.

Related Issues (10)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.