Hey Ruochi, Previously I tried using Higashi using the config file(copied below),

"chrom_list" error about higashi HOT 7 CLOSED

ma-compbio commented on August 16, 2024

"chrom_list" error

from higashi.

Comments (7)

ruochiz commented on August 16, 2024

Ah. I see. The config file should not be wrapped with [[ ]].
BTW, there are duplicated entries in both "chrom_list" and "impute_list" in your shared config file.
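
Both problems mentioned above can be caught before launching a run. The following is only a minimal sketch, assuming the config is a JSON file whose top level should be a single object and whose "chrom_list" and "impute_list" values are lists of chromosome names. The helper name `check_config` is hypothetical and is not part of Higashi.

```python
import json
from collections import Counter


def check_config(text, list_keys=("chrom_list", "impute_list")):
    """Return a list of human-readable problems found in a config string.

    Hypothetical helper for illustration; not part of the Higashi code base.
    """
    config = json.loads(text)
    problems = []
    # The top level should be a JSON object, not an object wrapped in [[ ]].
    if not isinstance(config, dict):
        problems.append(
            "top level is %s, expected a JSON object" % type(config).__name__
        )
        return problems
    # Report every value that occurs more than once in each list-valued key.
    for key in list_keys:
        values = config.get(key, [])
        dupes = sorted(str(v) for v, n in Counter(values).items() if n > 1)
        if dupes:
            problems.append(
                "%s has duplicated entries: %s" % (key, ", ".join(dupes))
            )
    return problems


if __name__ == "__main__":
    wrapped = '[[{"chrom_list": ["chr1", "chr2"]}]]'
    duplicated = '{"chrom_list": ["chr1", "chr1", "chr2"], "impute_list": ["chr1"]}'
    print(check_config(wrapped))
    print(check_config(duplicated))
```

To check a file on disk, read it first, e.g. `check_config(open("config.json").read())`, where the file name is a placeholder.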

I am currently also preparing a tutorial on a toy example dataset. But in the meantime, I'll share the link to the toy example with the corresponding config file with you here so you can mimic the structure of the input files.

https://drive.google.com/drive/folders/1NrKGRUKzcG_jfDjXV6qiYaaWSDg-XM-S?usp=sharing

Feel free to make suggestions if the documentation looks unclear or confusing to you. Thanks!


tarak77 commented on August 16, 2024

A tutorial on the toy data sounds great!
Just looking at the data.txt in the toy example: why is the first column the same throughout? I suppose there are 1087 cells, but your first column has a "1" in every row.

My goal: to see any sort of clusters in my data just by using the intrachromosomal contact information from all the chromosomes.
Also, could you explain how to use the parameter "loss_mode" for my goal?
And also the ideal way to use the training step (step 3) of Higashi?
And also which output to use to visualize the embedding? (I am a bit confused by what you have written in the wiki.)

Thanks again!


ruochiz commented on August 16, 2024

You spotted a bug 😄. It should be the cell names instead of all "1"; I'll fix that today.
For "loss_mode": "rank" stands for ranking mode, while "classification" stands for classification mode. If, at the given resolution, most of the non-zero entries in your dataset (e.g. > 80%) are just 1, then classification mode is sufficient, since it only learns to predict non-zero entries. If your non-zero entries have a good continuous span, ranking mode additionally learns the order of the values in the contact maps.
I don't quite understand the question. Do you just want to train Higashi at step 3? Higashi trains in sequential steps, and the embeddings from step 1 are important for step 3 as well. We do have options to skip the imputation after step 2. I'm also testing whether skipping step 2 leaves the step 3 results largely unaffected. I will update the code base once that is finished.
The {embedding_name}_0_origin.npy file in {temp_dir} contains the cell embeddings. You could also use Higashi_vis to inspect both the embeddings and the imputation results. I'll change the documentation to make this clearer.
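
The rule of thumb above for choosing "loss_mode" can be turned into a quick check. This is only a sketch: the function name `suggest_loss_mode` is hypothetical, the 0.8 threshold mirrors the "e.g. > 80%" figure in the reply and is not a hard cutoff, and extracting the contact counts from your own input files is left out.

```python
def suggest_loss_mode(counts, threshold=0.8):
    """Suggest "classification" or "rank" from an iterable of contact counts.

    Hypothetical helper for illustration; not part of the Higashi code base.
    Returns the suggested mode and the fraction of non-zero entries equal to 1.
    """
    nonzero = [c for c in counts if c != 0]
    if not nonzero:
        raise ValueError("no non-zero entries to inspect")
    # Fraction of non-zero contacts whose count is exactly 1.
    frac_ones = sum(1 for c in nonzero if c == 1) / len(nonzero)
    # Mostly ones: predicting non-zero entries is enough ("classification").
    # Otherwise the values have a wider span worth ranking ("rank").
    mode = "classification" if frac_ones > threshold else "rank"
    return mode, frac_ones


if __name__ == "__main__":
    print(suggest_loss_mode([0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2]))
    print(suggest_loss_mode([0, 1, 2, 3, 4]))
```

The returned fraction is worth looking at directly: a value close to the threshold means either mode may be reasonable.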


tarak77 commented on August 16, 2024

Haha, glad I did! (Also there is a {bottle_neck} parameter in the config_toy.json file. What is that?)
I understand.
Sorry, what I meant was the training step of Higashi (Step 3 on the wiki page, not -s 3). I have started training the model with -s 1, which will run all the steps in sequence.
Oh I see now. In my {temp_dir} I can see the *_0_origin.npy file and also the *_origin.npy files for all of my training chromosomes. So that's what made me confused.
At present, the code is running at epoch 10 (out of 15). I will probably let them all finish before downstream analysis on the embeddings...


ruochiz commented on August 16, 2024

{bottle_neck} is a parameter that has been deprecated and will no longer be used in the program.
Yes. Your understanding is correct.
The first 15 epochs only use bias effects (distance, batch id, chromosome id, cell coverage) to regress out the bias. They are followed by 60 epochs that learn the real embeddings.


tarak77 commented on August 16, 2024

I see. Can I then use UMAP on the {embedding_name}_0_origin.npy data?


ruochiz commented on August 16, 2024

Yes.
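
For readers who want to try this, here is a minimal sketch of loading the embedding file and projecting it with UMAP. It assumes the .npy file holds a 2-D array with one row per cell (the thread does not state the shape), that NumPy and the third-party umap-learn package are installed, and that the helper names are hypothetical. The `temp_dir` and `embedding_name` values must match your own config.

```python
import os

import numpy as np


def embedding_path(temp_dir, embedding_name):
    """Build the path of the cell-embedding file named in the thread."""
    return os.path.join(temp_dir, "%s_0_origin.npy" % embedding_name)


def load_cell_embeddings(temp_dir, embedding_name):
    """Load the embeddings and check the assumed (cells x dims) layout."""
    embeddings = np.load(embedding_path(temp_dir, embedding_name))
    # Assumption: rows are cells and columns are embedding dimensions.
    if embeddings.ndim != 2:
        raise ValueError("expected a 2-D array, got shape %r" % (embeddings.shape,))
    return embeddings


def project_with_umap(embeddings, random_state=0):
    """Reduce the embeddings to two dimensions with umap-learn."""
    # Imported lazily so the loading helpers work without umap-learn.
    import umap

    reducer = umap.UMAP(n_components=2, random_state=random_state)
    return reducer.fit_transform(embeddings)
```

A typical call would be `project_with_umap(load_cell_embeddings(temp_dir, embedding_name))`, after which the two output columns can be passed to any scatter-plot routine.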
