
"chrom_list" error about higashi HOT 7 CLOSED

ma-compbio commented on August 16, 2024
"chrom_list" error

from higashi.

Comments (7)

ruochiz commented on August 16, 2024

Ah, I see. The config file entries should not be wrapped with [[ ]].
BTW, there are duplicated entries in both "chrom_list" and "impute_list" in your shared config file.
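To illustrate the two points above, here is a minimal sketch of what a valid fragment of the config might look like. Only the two keys discussed in this thread are shown, the chromosome names are illustrative, and the duplicate check is my own addition, not part of Higashi:

```python
import json

# Hypothetical minimal config fragment; chromosome names are illustrative.
# Note the values are plain JSON lists, NOT wrapped in an extra [[ ]].
config = {
    "chrom_list": ["chr1", "chr2", "chr3"],
    "impute_list": ["chr1", "chr2", "chr3"],
}

# Guard against the duplicated-entry mistake mentioned above.
for key in ("chrom_list", "impute_list"):
    values = config[key]
    assert len(values) == len(set(values)), f"duplicate entries in {key}"

print(json.dumps(config, indent=2))
```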

I am also currently preparing a tutorial on a toy example dataset. In the meantime, here is a link to the toy example and its config file so you can mimic the structure of the input files:

https://drive.google.com/drive/folders/1NrKGRUKzcG_jfDjXV6qiYaaWSDg-XM-S?usp=sharing

Feel free to make suggestions if the documentation looks unclear or confusing to you. Thanks!

tarak77 commented on August 16, 2024

A tutorial on the toy data sounds great!
Just looking at the data.txt in the toy example... why do you keep the first column the same? I suppose there are 1087 cells, but your first column is "1" throughout?
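A quick way to spot the oddity described above is to check whether the first column actually varies across rows. The column layout here (cell id, chrom1, pos1, chrom2, pos2, count) is an assumption based on a typical single-cell Hi-C contact list, not taken from the toy example itself:

```python
import csv
import io

# Stand-in for a few rows of a tab-separated data.txt; the assumed columns
# are (cell id, chrom1, pos1, chrom2, pos2, count).
sample = """1\tchr1\t10000\tchr1\t50000\t2
1\tchr2\t20000\tchr2\t90000\t1
1\tchr3\t30000\tchr3\t70000\t1
"""

first_col = [row[0] for row in csv.reader(io.StringIO(sample), delimiter="\t")]
# If every row carries the same "cell id", something is wrong for a
# multi-cell dataset.
suspicious = len(set(first_col)) == 1
print(suspicious)  # True for the rows above
```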

My goal: to see any sort of clusters in my data just by using the intrachromosomal contact information from all the chromosomes.
Could you also explain how to use the "loss_mode" parameter for this goal?
What is the ideal way to run the training step (step 3) of Higashi?
And which output should I use to visualize the embedding? (I am a bit confused by what's written in the wiki.)

Thanks again!

ruochiz commented on August 16, 2024
  1. You spotted a bug 😄 . It should be the cell names instead of all "1"; I'll fix that today.
  2. "rank" stands for ranking mode, while "classification" stands for classification mode. If, at the given resolution, most of the non-zero entries in your dataset (e.g. > 80%) are just 1, then classification mode is sufficient, as it only learns to predict non-zero entries. If your non-zero entries span a good continuous range, then ranking mode will additionally learn the order of the values in the contact maps.
  3. I don't quite understand the question. Do you just want to train Higashi at step 3? Higashi trains its stages sequentially, so the embeddings from step 1 are important for step 3 as well. We do have options to skip the imputation after step 2. I'm also testing whether skipping step 2 affects the step 3 results much; I will update the codebase once that's finished.
  4. The {embedding_name}_0_origin.npy file in {temp_dir} contains the cell embeddings. You can also use Higashi_vis to inspect both the embeddings and the imputation results. I'll update the documentation to make this clearer.
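The rule of thumb in point 2 can be sketched as a small helper. The function name and the 80% threshold are illustrative (taken from the "e.g. > 80%" in the reply), not part of Higashi's API:

```python
import numpy as np

def suggest_loss_mode(contact_values, threshold=0.8):
    """Suggest a loss_mode from the non-zero contact values.

    Heuristic from this thread: if most non-zero entries are exactly 1,
    "classification" suffices; otherwise "rank" also learns the ordering
    of the counts.
    """
    nonzero = contact_values[contact_values > 0]
    frac_ones = np.mean(nonzero == 1)
    return "classification" if frac_ones > threshold else "rank"

# Mostly-binary contact values -> classification mode is enough.
print(suggest_loss_mode(np.array([0, 1, 1, 1, 1, 1, 2])))  # classification
# A broad spread of counts -> ranking mode can exploit the ordering.
print(suggest_loss_mode(np.array([0, 1, 3, 7, 2, 5])))  # rank
```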

tarak77 commented on August 16, 2024
  1. Haha, glad I did! (Also there is a {bottle_neck} parameter in the config_toy.json file. What is that?)
  2. I understand.
  3. Sorry, what I meant was the training step of Higashi (step 3 on the wiki page, not -s 3). I have started training the model with -s 1, which will run all the steps sequentially.
  4. Oh, I see now. In my {temp_dir} I can see the *_0_origin.npy file and also *_origin.npy files for all of my training chromosomes, which is what confused me.
    At present, the code is running at epoch 10 (out of 15). I will probably let it finish before running downstream analysis on the embeddings...

ruochiz commented on August 16, 2024
  1. It's a parameter that has been deprecated and is no longer used by the program.
  2. Yes, your understanding is correct.
  3. The first 15 epochs just use bias effects (distance, batch id, chromosome id, cell coverage) to regress out the bias. The following 60 epochs learn the real embeddings.

tarak77 commented on August 16, 2024

I see. Can I then use UMAP on the {embedding_name}_0_origin.npy data?

ruochiz commented on August 16, 2024

Yes.
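A hedged sketch of that last step, projecting the saved cell embeddings to 2-D. The umap-learn call is the standard `UMAP(...).fit_transform` API; the PCA fallback and the random stand-in data are my additions so the snippet runs even without umap-learn or the real .npy file:

```python
import numpy as np

def project_2d(embeddings):
    """Project cell embeddings to 2-D; use UMAP if installed, else PCA."""
    try:
        import umap  # from the umap-learn package
        return umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(embeddings)
    except ImportError:
        # PCA fallback via SVD: project onto the top two principal components.
        centered = embeddings - embeddings.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return centered @ vt[:2].T

# In practice, load the file discussed in this thread:
# emb = np.load("{temp_dir}/{embedding_name}_0_origin.npy")
emb = np.random.default_rng(0).normal(size=(100, 64))  # stand-in embeddings
coords = project_2d(emb)
print(coords.shape)  # (100, 2)
```

The resulting (n_cells, 2) array can be scatter-plotted and colored by any available cell labels to look for clusters.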
