Comments (20)

ruochiz commented on August 16, 2024

If you interrupt the code with a KeyboardInterrupt, could you attach the final error report here so I can check which part it got stuck on? Thanks.

My guess is that it got stuck in the generate_feats step for the cell nodes, but it would be good to have some confirmation.

from higashi.

jshi7 commented on August 16, 2024

Unfortunately, I am running it as a job on Slurm, so I'm not sure how I would do this (but I will look into it). However, node_feats.hdf5 is definitely not complete, because I tried running the next step (prep_model) and it gave an error because of that file.
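
For batch jobs where Ctrl+C is not available, one possible way to get the equivalent of a KeyboardInterrupt traceback is Python's standard faulthandler module. The sketch below is not part of Higashi; the helper name enable_stack_dump and the log path are hypothetical, and it assumes a Unix system, where faulthandler.register is available.

```python
import faulthandler
import signal


def enable_stack_dump(log_path, signum=signal.SIGUSR1):
    """Dump the traceback of every thread to log_path when signum arrives.

    Hypothetical helper, not part of Higashi. Unix only.
    """
    # The file must stay open for as long as the handler is registered,
    # so the caller should keep the returned handle alive.
    log_file = open(log_path, "w")
    faulthandler.register(signum, file=log_file, all_threads=True)
    return log_file


# Assumed usage at the top of the job script:
#   handle = enable_stack_dump("stack_dump.log")
#   higashi_model.process_data()
```

With this in place, sending SIGUSR1 to the Python process (for example with kill -USR1 <pid> on the compute node, or through the scheduler's own signalling command) should write the current stack to the log file without stopping the job. Only the process that registered the handler is dumped, so worker processes would not appear in the output.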

jshi7 commented on August 16, 2024
[image] Here is the error!

ruochiz commented on August 16, 2024

Is this the beginning of the error? Usually the actual error from a multiprocessing job is at the top.
I'm guessing it is caused by the multiprocessing setting, for which I can quickly add a patch, but it would be good to confirm which part causes the error.

jshi7 commented on August 16, 2024
[image] [image] Sorry about this; I attached the full error!

ruochiz commented on August 16, 2024

Hey, I just committed a patch for this. Could you check whether it fixes the error? After you pull the repo and update the package, call higashi_model.process_data(disable_mpl=True) instead of higashi_model.process_data(). This disables multiprocessing when creating the sparse matrices. Let me know if that fixes the bug. Thanks.

jshi7 commented on August 16, 2024

Thank you for your help! Unfortunately, I think it's still getting stuck even after I ran git pull and set disable_mpl=True.

ruochiz commented on August 16, 2024

Hmm. Can you kill it and attach the full error information? I think it should now show which part it is actually stuck on.

jshi7 commented on August 16, 2024

Actually, I tried rerunning it and it worked this time!! Thank you so much for your help!!

ruochiz commented on August 16, 2024

Oh... Cool. So it is the multiprocessing stuff.

XiongGZ commented on August 16, 2024

@ruochiz Hello! I tried to use higashi_model.process_data(disable_mpl=True) to solve the problem,
but I got an error: TypeError: process_data() got an unexpected keyword argument 'disable_mpl'
My higashi version is higashi 0.1.1a1 py_0.
What can I do to solve this problem?
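
The TypeError above suggests that the installed release predates the disable_mpl patch. One defensive pattern, sketched here with a hypothetical helper name (call_with_supported_kwargs) that is not part of Higashi, is to inspect the function signature and pass only the keyword arguments it accepts.

```python
import inspect


def call_with_supported_kwargs(func, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept.

    Hypothetical helper, not part of Higashi.
    """
    params = inspect.signature(func).parameters
    # A function that declares **kwargs accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    supported = {
        name: value
        for name, value in kwargs.items()
        if name in params and params[name].kind in keyword_kinds
    }
    dropped = sorted(set(kwargs) - set(supported))
    if dropped:
        print(f"Ignoring unsupported arguments: {dropped}")
    return func(**supported)
```

Under these assumptions, call_with_supported_kwargs(higashi_model.process_data, disable_mpl=True) would run on both old and new versions. On an old version the argument is simply dropped, so multiprocessing stays enabled; the printed message is the cue that the installed package needs updating to a version that has the option.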

XiongGZ commented on August 16, 2024

I changed the default disable_mpl value in Process.py from def create_matrix(config, disable_mpl=False) to def create_matrix(config, disable_mpl=True), but it didn't work.

ruochiz commented on August 16, 2024

Could you try running it with just higashi_model.process_data() and see if that fixes the issue? Thanks!

XiongGZ commented on August 16, 2024

[image]
It didn't work. Should I change the default disable_mpl value in Process.py back to False?
(Different datasets give different results. On another dataset it worked, but very slowly.)
[image]
I found that the Process.py in higashi 0.1.1a1 py_0 doesn't have the disable_mpl parameter, so I downloaded Process.py from GitHub to overwrite the local Process.py and then changed the default value.

ruochiz commented on August 16, 2024

Right. The disable_mpl option disables multiprocessing and uses only one thread to run the program, so it is expected to be slow. It's mainly for debugging (and is not included in any release version). I added this option because errors in child processes are sometimes not printed in the main program.
Usually the multiprocessing hang happens on Windows machines.

By default, it's recommended to run the program with multiprocessing enabled (so disable_mpl=False). Did you find that the code got stuck at the processing stage?
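
The general pattern behind such a switch can be sketched as follows. This is not Higashi's implementation; run_tasks and its arguments are hypothetical names, and the standard concurrent.futures module stands in for whatever Higashi uses internally.

```python
from concurrent.futures import ProcessPoolExecutor


def run_tasks(func, items, disable_mpl=False, max_workers=None):
    """Apply func to every item, serially or in worker processes.

    Hypothetical sketch, not Higashi's code.
    """
    if disable_mpl:
        # Serial path: slow, but any exception surfaces directly in the
        # main process with a complete traceback.
        return [func(item) for item in items]
    # Parallel path: func and the items must be picklable.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, item) for item in items]
        # result() re-raises an exception that was raised in the worker.
        return [future.result() for future in futures]
```

On platforms that start worker processes with the spawn method (Windows, for instance), the code that calls the parallel path has to sit under an if __name__ == "__main__": guard. A missing guard is a common cause of multiprocessing problems there, which is why a serial fallback is handy for narrowing things down.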

XiongGZ commented on August 16, 2024

Yes, the code is stuck at the processing stage, higashi_model.process_data(). I started process_data() 50 minutes ago, but it has only finished 5%. (GPU number: 5)
[image]

ruochiz commented on August 16, 2024

Hmm. process_data does not use the GPU. The matrix-creation tasks can be slow depending on the parameters. What are the number of cells and the resolution of the contact map, and did you end up using disable_mpl or not?

Also, while the code is running, could you check htop or top to see whether Higashi is using multiple cores efficiently? Thanks!
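
Alongside htop, a quick check from inside the job can show how many cores the process is actually allowed to use, which on a cluster may be fewer than the node has. A minimal sketch using only the standard library; usable_cpu_count is a hypothetical helper name.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs this process may run on.

    Hypothetical helper. os.sched_getaffinity exists only on some
    platforms (for example Linux), so fall back to os.cpu_count().
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"CPUs on the machine: {os.cpu_count()}")
    print(f"CPUs usable by this process: {usable_cpu_count()}")
```

If the second number is much smaller than the first, the job allocation rather than the program may be what limits parallelism.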

XiongGZ commented on August 16, 2024

I ran Higashi on the GPU node. There are about 7000 cells, and the resolution is 50k. The disable_mpl parameter was set to False (I checked that Higashi was using multiple cores). Maybe the resolution is too small?

ruochiz commented on August 16, 2024

These all sound like reasonable parameters. 50k is a bit small, but Higashi should be able to handle that.
It does seem like there are only 3,103,646 contacts for 7k cells, so on average each cell gets ~360 non-zero entries, which is really small.
For comparison, one of the sparser scHiC datasets, the 4DN sciHi-C dataset, has ~103,497,337 contacts for 16,707 cells, and it is usually processed at 500kb or 1Mb resolution.
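
The sparsity comparison can be reproduced with a short calculation. The sketch below uses the contact totals quoted above; the cell count for the first dataset is the approximate 7,000 figure from this thread, so its average is only indicative and shifts with the exact number of cells.

```python
def contacts_per_cell(total_contacts, n_cells):
    """Average number of contacts per cell."""
    return total_contacts / n_cells


# Figures quoted in the thread; 7_000 is an approximate cell count.
datasets = {
    "this dataset": (3_103_646, 7_000),
    "4DN sciHi-C": (103_497_337, 16_707),
}

for name, (contacts, cells) in datasets.items():
    average = contacts_per_cell(contacts, cells)
    print(f"{name}: {average:,.0f} contacts per cell")
```

On these numbers the 4DN dataset has more than ten times as many contacts per cell.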

Just for debugging purposes, I tested processing the 4DN sciHi-C dataset at 50kb on a GPU node with about 32 CPU threads. The running speed is roughly like this (the config name says 500k, but I changed it to 50kb as shown):

[image: CleanShot 2023-08-22 at 22 37 11]
[image: CleanShot 2023-08-22 at 22 58 55]

For troubleshooting, if you are comfortable with it, you can share a fraction of the dataset that replicates this behavior here: [email protected], and I can take a look later.

XiongGZ commented on August 16, 2024

The related data has been sent to you. Thanks for your patience!

