karlnapf / independent-jobs
Python framework for independent computation with backends for batch clusters
License: Other
I am using engine = SlurmComputationEngine(..) to submit jobs, and at the end I call engine.wait_for_all(). When the number of jobs is large, execution never gets past the engine.wait_for_all() line, even though I can confirm that all the jobs are done. I suspect that the engine somehow thinks that some jobs are lost and resubmits them (?). Using commit 06ed160a9 on 12 May 2017.
According to the log, some of the submitted jobs were probably killed by Slurm, and the computation engine seems to have resubmitted them. What I do not understand is why I then got
AttributeError: 'SlurmComputationEngine' object has no attribute 'all'
Full log
INFO: 2015-12-09 21:56:03,634: BatchClusterComputationEngine._wait_until_n_unfinished(): Waiting for e1_8b4b6ea0-b273-47df-83ed-807a1b783db2 and 219 other jobs
INFO: 2015-12-09 21:56:29,261: BatchClusterComputationEngine._wait_until_n_unfinished(): e1_8b4b6ea0-b273-47df-83ed-807a1b783db2 exceeded maximum waiting time of 3540h
INFO: 2015-12-09 21:56:29,261: BatchClusterComputationEngine._resubmit(): Re-submitting under name e1_00dbeb1f-91d9-4ba2-b2dd-d234e8993689
Traceback (most recent call last):
  File "freqopttest/ex/ex1_power_vs_n.py", line 199, in <module>
    main()
  File "freqopttest/ex/ex1_power_vs_n.py", line 168, in main
    engine.wait_for_all()
  File "/nfs/nhome/live/wittawat/git/independent-jobs/independent_jobs/engines/BatchClusterComputationEngine.py", line 279, in wait_for_all
    self._wait_until_n_unfinished(0)
  File "/nfs/nhome/live/wittawat/git/independent-jobs/independent_jobs/engines/BatchClusterComputationEngine.py", line 274, in _wait_until_n_unfinished
    self._resubmit(job_name)
  File "/nfs/nhome/live/wittawat/git/independent-jobs/independent_jobs/engines/BatchClusterComputationEngine.py", line 223, in _resubmit
    for i in range(len(self.all)):
AttributeError: 'SlurmComputationEngine' object has no attribute 'all'
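The traceback suggests the error only surfaces on the resubmission path: the failing line reads self.all, an attribute that apparently is never assigned, so nothing goes wrong until a job exceeds its waiting time. The sketch below reproduces that pattern in isolation. ToyEngine, submitted_jobs, resubmit_buggy and resubmit_fixed are hypothetical names chosen for illustration and are not taken from independent-jobs; which attribute the real _resubmit() was meant to iterate over is an assumption.

```python
class ToyEngine:
    """Hypothetical stand-in, NOT the real SlurmComputationEngine."""

    def __init__(self):
        # Only this list is ever assigned; no attribute named `all` exists.
        self.submitted_jobs = []

    def submit(self, job_name):
        self.submitted_jobs.append(job_name)

    def resubmit_buggy(self, job_name):
        # Same shape as the failing line in the traceback:
        # `self.all` was never set, so this raises AttributeError.
        for i in range(len(self.all)):
            pass

    def resubmit_fixed(self, old_name, new_name):
        # Assumed intent: find the job in the list that does exist
        # and replace its name with the resubmitted one.
        for i, name in enumerate(self.submitted_jobs):
            if name == old_name:
                self.submitted_jobs[i] = new_name
                return True
        return False


def demo():
    engine = ToyEngine()
    engine.submit("e1_a")
    engine.submit("e1_b")
    try:
        engine.resubmit_buggy("e1_a")
        error = None
    except AttributeError as exc:
        error = str(exc)
    renamed = engine.resubmit_fixed("e1_a", "e1_c")
    return error, renamed, engine.submitted_jobs
```

Because the buggy branch is only reached when a job is considered lost, small runs where every job finishes in time would never trigger it, which would match the observation that the problem appears only with many jobs.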