Comments (20)
If you interrupt the run (KeyboardInterrupt), could you attach the final error report here so I can check which part it got stuck on? Thanks.
My guess is that it got stuck in the generate_feats step for the cell nodes, but it would be good to have some confirmation.
from higashi.
Unfortunately, I am running it as a job on Slurm, so I'm not sure how I would do this (but I will look into it). However, node_feats.hdf5 is definitely incomplete: when I tried running the next step (prep_model), it gave an error because of that file.
from higashi.
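For a batch job where pressing Ctrl+C is not an option, one possible way to see where a Python process is stuck is the standard-library `faulthandler` module. The sketch below is only an illustration and is not part of Higashi; the helper name `enable_hang_dumps` and the log path are hypothetical. It asks Python to write the tracebacks of all threads to a file at a fixed interval.

```python
import faulthandler


def enable_hang_dumps(log_path, timeout_s=600.0):
    """Write the tracebacks of all threads to log_path every timeout_s seconds.

    Hypothetical helper for illustration; call it once at the top of the
    job script, before the long-running step.
    """
    log_file = open(log_path, "w")
    # repeat=True keeps dumping at every interval until cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    # Return the file object so the caller keeps it open while dumps are active.
    return log_file
```

Note that this only covers the process that calls it; worker processes started by multiprocessing would need their own call.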
![image](https://private-user-images.githubusercontent.com/29166540/237270700-c5aeebc2-4d90-46e3-a430-ef6cac48642f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE4MTgyMjMsIm5iZiI6MTcyMTgxNzkyMywicGF0aCI6Ii8yOTE2NjU0MC8yMzcyNzA3MDAtYzVhZWViYzItNGQ5MC00NmUzLWE0MzAtZWY2Y2FjNDg2NDJmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI0VDEwNDUyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTM4YjcxZTAxOTQzZGYxMjg5ZmExZWZkZDJjNTk5NWMyNTQwMTExMjI3ZWYwMWQyNGNmNWRkOWFmZGQ5NjQ5NTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.aJTts78WD1ICnjLln38WjRWna8PUKcCARJtkxkpDZ7Q)
from higashi.
Is this the beginning of the error? Usually the actual error from a multiprocessing job appears at the top of the output.
My guess is that it is caused by the multiprocessing setting, for which I can add a patch quickly, but it would be good to confirm which part causes the error.
from higashi.
![image](https://private-user-images.githubusercontent.com/29166540/237470210-3d8e7555-03f2-45d6-b1e1-64092d4ca90e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE4MTgyMjMsIm5iZiI6MTcyMTgxNzkyMywicGF0aCI6Ii8yOTE2NjU0MC8yMzc0NzAyMTAtM2Q4ZTc1NTUtMDNmMi00NWQ2LWIxZTEtNjQwOTJkNGNhOTBlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI0VDEwNDUyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTM3Zjg3OTVjY2ExOGNiMzY5ODIxZDk3NWM3OTBhODA4MmI5ZTU5Mzk3NDQ0NTRhYjdlZjU3NGQzYTcyZDE3Y2YmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.MUWssq5HoecrpQDpr7SsRq6J364Wxa7QJN71Nue6r10)
![image](https://private-user-images.githubusercontent.com/29166540/237470243-1cbf8bd5-70ef-4541-88f8-a3e10014f38e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE4MTgyMjMsIm5iZiI6MTcyMTgxNzkyMywicGF0aCI6Ii8yOTE2NjU0MC8yMzc0NzAyNDMtMWNiZjhiZDUtNzBlZi00NTQxLTg4ZjgtYTNlMTAwMTRmMzhlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI0VDEwNDUyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTliMDMyZmFmYTcyNTgzNWJhZDQ1ZTdjOGQ4NTJiODNlYzRmYTM2NTJmNzkwOThiNmMzY2NkY2EyZGEwNDMxNWQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.C9T9ASWLo5JpXj0LMYWpyRrWP26puLCcUGtrChbRBYE)
from higashi.
Hey, I just committed a patch for this. Could you check whether it fixes the error? After you pull the repo and update the package, instead of higashi_model.process_data()
could you call higashi_model.process_data(disable_mpl=True)
to disable multiprocessing when creating the sparse matrices, and see if that fixes the bug? Thanks.
from higashi.
Thank you for your help! Unfortunately, I think it's still getting stuck, even after I ran git pull and set disable_mpl=True.
from higashi.
Hmm. Can you kill it and attach the full error output? I think it should now show which part it is actually stuck on.
from higashi.
Actually, I tried rerunning it and it worked this time!! Thank you so much for your help!!
from higashi.
Oh... Cool. So it is the multiprocessing stuff.
from higashi.
@ruochiz Hello! I tried to use
higashi_model.process_data(disable_mpl=True)
to solve the problem, but I got an error: TypeError: process_data() got an unexpected keyword argument 'disable_mpl'
My Higashi version is higashi 0.1.1a1 py_0. What can I do to solve this problem?
I changed the default value of disable_mpl in Process.py, but it didn't work:
from def create_matrix(config, disable_mpl=False)
to def create_matrix(config, disable_mpl=True)
from higashi.
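A TypeError like the one above means the installed function does not declare that parameter. One way to check this before calling, sketched here with the standard-library `inspect` module, is to look at the function's signature. The helper name `supports_kwarg` is hypothetical and not part of Higashi.

```python
import inspect


def supports_kwarg(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
```

For example, `supports_kwarg(higashi_model.process_data, "disable_mpl")` returning False would indicate that the installed version predates the patch.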
Could you try running it with just higashi_model.process_data()
and see if that fixes the issue? Thanks!
from higashi.
It didn't work. Should I change the default value of disable_mpl in Process.py back to False?
(Different datasets give different results. With another dataset it worked, but very slowly.)
I found that Process.py in higashi 0.1.1a1 py_0 doesn't have the disable_mpl parameter, so I downloaded Process.py from GitHub to overwrite the local Process.py and then changed the default value.
from higashi.
Right. The disable_mpl option disables multiprocessing and uses only one thread to run the program, so it is expected to be slow. It is mainly meant for debugging (and is not included in any release version). I added this option because errors in child processes are sometimes not printed in the main program.
Multiprocessing hangs like this usually happen on Windows machines.
By default, it is recommended to run the program with multiprocessing enabled (so disable_mpl=False). Did you find that the code got stuck at the processing stage?
from higashi.
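To illustrate the trade-off described above, here is a minimal sketch of a serial/parallel switch. It is not Higashi's implementation: `run_tasks` and `square` are hypothetical names, and only the flag name `disable_mpl` is borrowed from the thread. In the serial path an exception surfaces immediately with an ordinary traceback, which is what makes it useful for debugging.

```python
from concurrent.futures import ProcessPoolExecutor


def square(x):
    # Stand-in for a real per-task workload (hypothetical).
    if x < 0:
        raise ValueError(f"negative input: {x}")
    return x * x


def run_tasks(func, items, disable_mpl=False, max_workers=4):
    """Apply func to every item, either serially or in a process pool."""
    if disable_mpl:
        # Serial path: slower, but errors are raised right here.
        return [func(item) for item in items]
    # Parallel path: work is spread over worker processes; an exception
    # raised in a worker is re-raised when the results are collected.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

When using the parallel path from a script, the call should sit under an `if __name__ == "__main__":` guard on platforms that start worker processes with spawn, such as Windows.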
Yes, the code is stuck at the processing stage, higashi_model.process_data().
I started process_data() 50 minutes ago, but it has only finished 5%. (GPU number: 5)
from higashi.
Hmm. process_data does not use the GPU. The matrix-creation tasks can be slow depending on the parameters. What are the number of cells and the resolution of the contact map, and did you end up using disable_mpl
or not?
Also, while the code is running, could you check htop or top to see whether Higashi is using multiple cores efficiently? Thanks!
from higashi.
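Alongside htop or top, it can help to check how many CPU cores the job is actually allowed to use, since a scheduler may grant fewer cores than the node has. The following sketch uses only the standard library; the helper name `usable_cpu_count` is hypothetical, and `os.sched_getaffinity` is only available on some platforms (Linux among them), so the code falls back to `os.cpu_count()` elsewhere.

```python
import os


def usable_cpu_count():
    """Number of CPUs this process may run on, or the machine total as a fallback."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity restrictions applied to this process.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"CPUs usable by this process: {usable_cpu_count()}")
```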
I ran Higashi on a GPU node. There are about 7000 cells, and the resolution is 50k. The disable_mpl parameter
was set to False (I checked that Higashi was using multiple cores). Maybe the resolution is too small?
from higashi.
These all sound like reasonable parameters. 50k is a bit small, but Higashi should be able to handle that.
It does seem that there are only 3,103,646 contacts for 7k cells, so on average each cell gets ~360 non-zero entries, which is really small.
For comparison, one of the sparser scHiC datasets, the 4DN sciHi-C dataset, has ~103,497,337 contacts for 16,707 cells, and it is usually processed at 500 kb or 1 Mb resolution.
Just for debugging purposes, I tested processing the 4DN sciHi-C dataset at 50 kb on a GPU node with about 32 CPU threads; the running speed is roughly like this (the config name says 500k, but I changed it to 50 kb as shown):
For troubleshooting, if you are comfortable with it, you can share a fraction of the dataset that reproduces this behavior here: [email protected] and I can take a look later.
from higashi.
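The sparsity comparison above can be reproduced with simple division. The sketch below uses the totals quoted in the thread; the cell count of 7,000 is the rounded figure given by the reporter, which is an assumption. With that rounded count the average comes out at roughly 443 rather than ~360, and the thread does not say how the ~360 figure was computed (for instance, from an exact cell count or from non-zero matrix entries after binning).

```python
def contacts_per_cell(total_contacts, n_cells):
    """Average number of contacts per cell."""
    return total_contacts / n_cells


# Totals as quoted in the thread; 7,000 is the reporter's rounded cell count.
reported = contacts_per_cell(3_103_646, 7_000)
four_dn = contacts_per_cell(103_497_337, 16_707)

print(f"reported dataset: {reported:.0f} contacts per cell")
print(f"4DN sciHi-C:      {four_dn:.0f} contacts per cell")
print(f"ratio:            {four_dn / reported:.1f}x")
```

Either way, the reported dataset has more than an order of magnitude fewer contacts per cell than the 4DN sciHi-C dataset.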
The related data has been sent to you. Thanks for your patience!
from higashi.