My collaborator and I would like to use Variant Spark to identify genetic variants ass

run on single chromosomes? about variantspark HOT 1 CLOSED

sshanshans commented on June 10, 2024

run on single chromosomes?


Comments (1)

rocreguant commented on June 10, 2024

Hi Shan,
You don't need to create a single merged file; you can merge the per-chromosome files at run time using Hail commands before running VariantSpark (VS).
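As a rough sketch of what merging at run time could look like: Hail's `import_vcf` accepts a list of paths and returns one MatrixTable. The file layout (`data/chr1.vcf.bgz`, ...) and the helper names `chromosome_paths` and `load_all_chromosomes` are assumptions made for illustration, and the sketch also assumes every file contains the same samples in the same order.

```python
def chromosome_paths(template, chromosomes=None):
    """Expand a path template into one path per chromosome.

    `template` must contain a `{chrom}` placeholder (hypothetical layout).
    Defaults to the autosomes 1..22.
    """
    if chromosomes is None:
        chromosomes = [str(c) for c in range(1, 23)]
    return [template.format(chrom=c) for c in chromosomes]


def load_all_chromosomes(template):
    """Import every per-chromosome VCF into a single Hail MatrixTable."""
    # Imported lazily so the path helper above can be used without Hail.
    import hail as hl

    paths = chromosome_paths(template)
    # import_vcf takes a list of paths and returns one MatrixTable,
    # so no merged file has to be written to disk first.
    return hl.import_vcf(paths)
```

Calling `load_all_chromosomes("data/chr{chrom}.vcf.bgz")` would then give a single dataset that can be passed on to VS.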

From your description, that will probably require a lot of RAM, and the more (fast) CPUs the better.

If your cluster allows it, running everything at once would be the most beneficial. When all chromosomes are run together, VS can extract extra information, such as cross-chromosome epistatic interactions, that it otherwise could not. However, computers have physical limitations, so the second-best option is to run it chromosome by chromosome and aggregate the results.

Another option is to prune the dataset. You could remove highly correlated mutations using linkage disequilibrium, which would give you a "cleaner" dataset. You could also use VS in a two-step process: first, run each chromosome individually and remove all variants with little to no importance; then run the complete dataset using only the variants that passed the importance threshold.
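The filtering half of the two-step process could be sketched roughly as below. This is only an illustration in plain Python: the function name `select_important`, the variant IDs, the scores, and the threshold are all made up, and it assumes the per-chromosome importance scores have already been exported from each VS run into a dictionary.

```python
def select_important(per_chrom_scores, threshold):
    """Keep the variants whose importance exceeds `threshold`.

    `per_chrom_scores` maps chromosome -> {variant_id: importance},
    i.e. one set of scores per single-chromosome run.
    """
    kept = []
    for scores in per_chrom_scores.values():
        for variant_id, importance in scores.items():
            if importance > threshold:
                kept.append(variant_id)
    return sorted(kept)


# Made-up scores, for illustration only.
example_scores = {
    "1": {"1:1000:A:G": 0.02, "1:2000:C:T": 0.0},
    "2": {"2:500:G:A": 0.0001, "2:900:T:C": 0.05},
}

# Variants that would go into the second, whole-genome run.
shortlist = select_important(example_scores, threshold=0.001)
print(shortlist)
```

The resulting shortlist is what you would use to subset the complete dataset before the second run; the threshold itself is arbitrary here and would need to be chosen for your data.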

I hope that helps :)
