Comments (33)
Hi Chao,
'Multithreaded' is intended for small-scale applications in which each solver instance runs on a single core. Currently it is not possible to use it with solver executables that require more than one core.
That said, it might be possible to add this functionality.
Could you describe what system/cluster you are running on? What workflow are you carrying out (inversion, migration, ...)?
Ryan
from seisflows.
UPDATE: Actually, I believe there is a way to add this functionality by modifying only a single line. Since it's such a simple change, I'll go ahead and submit a pull request.
UPDATE: I think it should be ready to go now.
Hi Ryan,
I also think we can use NPROC>1; in that case, we could use mpiexec under the 'Multithreaded' system.
However, I tested your latest version today and found a bug when running the 2D checkers example. The error occurs when the 'combine' step sums individual kernels in base.py (solver).
Maybe it happens when xcombine_sem is invoked; I'm not sure. Can you check?
attached is parameters.py
parameters.txt
Hi Chao, From the traceback it looks like an issue with the utility for smoothing kernels. (To double-check, you could try running with SMOOTH=False.)
As a workaround, I would suggest commenting out the 'solver.specfem2d.smooth' method entirely, so that the parent class method 'solver.base.smooth', which uses the SPECFEM xcombine_sem utility, is invoked instead. Does that make sense?
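The inheritance fallback described here can be sketched roughly like this. The class and method names are placeholders modeled on 'solver.base' and 'solver.specfem2d'; the bodies are illustrative, not the actual seisflows implementations.

```python
class Base:
    def smooth(self, path):
        # parent implementation: would invoke the SPECFEM utility
        return "base smooth: " + path

class Specfem2D(Base):
    # def smooth(self, path):          # commenting out the subclass
    #     return "2d-specific smooth"  # method makes Python fall back
    pass                               # to Base.smooth via inheritance

solver = Specfem2D()
print(solver.smooth("output/kernels"))  # -> base smooth: output/kernels
```

With the subclass method removed, attribute lookup walks the method resolution order and finds the parent's implementation, which is exactly the workaround suggested above.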
Hi Ryan, that makes sense. Actually, I have tried setting SMOOTH=False, and the following bug occurs:
Hi Chao, Very useful. I'm actually not sure what's going on with this traceback; some type of regression? Let me take a look.
Looks like our cluster here is having issues. I don't think I can debug this immediately, but I will as soon as it is back online.
In the meantime, could you remind me:
- the size of your model
- the number and types of material parameters
- what system/cluster you are running on, including the number of cores available and the memory per node?
EDIT:
- also, how many cores per solver instance were you using for the last traceback?
I just use the 2D checkers example:
attached is parameters.py
parameters.txt
- checkerboard model: http://tigress-web.princeton.edu/~rmodrak/2dAcoustic/checkers/
- just use vs
- just use my PC, not a cluster
BTW, in the bug.log there is a warning: mesh_properties.nproc != PAR.NPROC, because right now mesh_properties.nproc=1 and PAR.NPROC=4. Does this cause the problem?
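A hypothetical consistency check along the lines of that warning: infer the number of mesh slices from the proc*-prefixed database files and compare against the requested NPROC. The file-name pattern is an assumption based on SPECFEM conventions; this is not the actual seisflows check.

```python
import os
import re

def mesh_nproc(database_dir):
    """Count distinct MPI slices by scanning procNNNNNN_* file names."""
    procs = set()
    for name in os.listdir(database_dir):
        m = re.match(r"proc(\d{6})_", name)
        if m:
            procs.add(int(m.group(1)))
    return len(procs)

def check_consistency(database_dir, par_nproc):
    """Raise if the mesh was generated for a different slice count."""
    n = mesh_nproc(database_dir)
    if n != par_nproc:
        raise ValueError(
            f"mesh has {n} slices but PAR.NPROC={par_nproc}; "
            "regenerate the mesh with xmeshfem2D")
```

A mismatch like the one in the warning (1 slice in the mesh, NPROC=4 requested) would trip this check, which matches Ryan's diagnosis below that the model needs to be remeshed.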
Hi Chao, Did you remesh the model? To change the number of processors from 1 to 4, you would need to generate a new numerical mesh via xmeshfem2D.
I also used the model provided in your examples. You mean that if I want to change the number of processors from 1 to 4, I need to remesh the model using xmeshfem2D, that is, mpiexec -n 4 ./xmeshfem2D, right?
That's right, you'd need to remesh and supply a new model. You can find information on this, I believe, in the SPECFEM2D manual or on its issues page. Good luck!
Hi Chao, You need to create a new model in the form SPECFEM2D is able to read and write. Probably it would be good to start by familiarizing yourself with SPECFEM2D. The manual is a good place to start, and the issues page can be useful if you run into any trouble.
If it's alright I'll go ahead and close soon.
Hi, Ryan
Sorry, I still meet with bugs, even after remeshing the model, like this:
Can you try this 2D checkerboard test on your computer? It's weird.
Thanks
Hi Chao, MPI parallelization is working fine in the 3D case, so I'm not sure what's wrong in the 2D case. Perhaps check xcombine_sem for bugs (SPECFEM2D has never been a funded project, so unfortunately there are bugs). Also, check that xcombine_sem is being invoked with the proper mpiexec wrapper, overloading system.mpiexec if necessary.
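"Overloading system.mpiexec" could look roughly like the sketch below: subclass the system class and return the MPI launcher prefix that utilities such as xcombine_sem should be invoked with. The class names are assumptions modeled on the thread, not the real seisflows classes.

```python
class Multithreaded:
    def mpiexec(self):
        return ""  # default: no wrapper, each task runs serially

class MultithreadedMpi(Multithreaded):
    def __init__(self, nproc):
        self.nproc = nproc

    def mpiexec(self):
        # prefix prepended to every solver utility invocation
        return f"mpiexec -n {self.nproc} "

system = MultithreadedMpi(4)
print(system.mpiexec() + "bin/xcombine_sem")  # mpiexec -n 4 bin/xcombine_sem
```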
So you also meet the same problem on your computer, right? By the way, besides system='multithreaded', if I want to use mpiexec, can I use system='mpi'? In that case, can I set nproc>1?
I mean that all the examples in your guide for 2D use nproc=1. Do you have any examples that use nproc>1? If it is set as nproc=1, how can you use mpiexec (mpiexec -n nproc ./xmeshfem2d)?
Hi Chao, It might help to step back a bit first. You're running an inversion and you want each individual solver instance to run on multiple cores. In 3D this is currently working well for us. Such an approach is not currently implemented in 2D, but it should be fairly straightforward if you are familiar with SPECFEM2D and seisflows.
But let me ask, why do you want to do this for 2D? If your 2D model is so large that you can't fit as many copies of it in the memory available on a single node as you have processors available on that node, then it makes sense to have each solver instance run on multiple cores. If not, I can't think of any significant advantage in terms of speed or efficiency.
Hi Ryan,
Actually, I just want to realize this: for each source, we can run in parallel (mpiexec -np nproc ./xspecfem).
I always use system='MULTITHREADED', which allows embarrassingly parallel tasks to be carried out several at a time. So for each task we still run serially rather than in parallel. Actually, in the script serial.py there is a choice:
so I figured that you provide a choice for us to use (mpiexec -n nproc ./xspecfem).
At the same time, I checked mpi.py,
so I am very puzzled about how you realize parallelism for each task, whether with system='MULTITHREADED' or system='MPI'. In your examples for 2D, NPROC is always set to 1; in my mind, that means each task is serial rather than parallel, right?
Good question, let me explain the naming convention.
The names of the modules in seisflows/system reflect how parallelization over shots is implemented. For example, system/serial means that shots are carried out one at a time; system/multithreaded means that as many shots are run at one time as the available number of processors allows.
There is no connection here to whether or not individual solver instances run in parallel, only to how parallelization over shots is handled.
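A toy illustration of that naming convention: 'serial' runs shots one at a time, 'multithreaded' runs several at once. Here run_shot is a stand-in for one forward simulation, not a real solver call.

```python
from concurrent.futures import ThreadPoolExecutor

def run_shot(i):
    return f"shot {i} done"  # placeholder for one forward simulation

def serial(ntask):
    # one shot at a time
    return [run_shot(i) for i in range(ntask)]

def multithreaded(ntask, workers=4):
    # several shots at a time, up to the number of available workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_shot, range(ntask)))

# Both give the same results; only the scheduling over shots differs.
print(serial(3))
print(multithreaded(3))
```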
Hi, Ryan
Thanks, that helps my comprehension.
- So all of the modules in seisflows/system just reflect how parallelization over shots is implemented, right?
- If we consider each shot, we can run serial or parallel; that is, NPROC determines it: if NPROC=1, ./xspecfem2d is invoked; if NPROC>1, mpirun -np NPROC ./xspecfem2d is invoked. As you know, I failed in the NPROC>1 case under system='MULTITHREADED'.
- So, have you tried NPROC>1 (parallel for each task) with any of the modules in seisflows/system? Because I only see examples with NPROC=1 for the 2D case.
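The dispatch described in the second point above can be sketched as follows. The executable names come from the thread; the function itself is illustrative, not the actual seisflows source.

```python
def solver_command(nproc, exe="./xspecfem2D"):
    """Build the per-shot solver invocation: direct call for NPROC=1,
    mpirun wrapper for NPROC>1."""
    if nproc == 1:
        return exe
    return f"mpirun -np {nproc} {exe}"

print(solver_command(1))  # ./xspecfem2D
print(solver_command(4))  # mpirun -np 4 ./xspecfem2D
```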
- correct
- correct
- we have used NPROC>1 routinely for 3D but not for 2D
Hi Ryan,
Makes sense, thanks for your help.
Hope it was helpful. Good luck!
Hi Ryan,
I can now use nproc>1 under system='MULTITHREADED'. I found there are two places we need to revise to realize that.
1) xcombine_sem
In the examples we use specfem2d-d745c542 (as you said last time, this version has some bugs in xcombine_sem). No matter whether you use ./xcombine_sem or mpirun -n 4 ./xcombine_sem, only one processor kernel is generated, proc000000_vs_kernel.bin (no proc000001_vs_kernel.bin, proc000002_vs_kernel.bin, proc000003_vs_kernel.bin). So I downloaded the latest version of specfem2d and used it instead of the old version when running mpirun -n 4 ./xcombine_sem; after that I get four kernel files, and then it works.
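A quick check along the lines of this observation: after a 4-rank xcombine_sem run you expect one kernel file per MPI slice. The procNNNNNN_<param>_kernel.bin naming comes from the thread; the helper functions are hypothetical.

```python
import os

def expected_kernels(nproc, param="vs"):
    """File names that a combine step over nproc slices should produce."""
    return [f"proc{p:06d}_{param}_kernel.bin" for p in range(nproc)]

def missing_kernels(directory, nproc, param="vs"):
    """Which expected per-slice kernel files are absent from directory."""
    have = set(os.listdir(directory))
    return [f for f in expected_kernels(nproc, param) if f not in have]

print(expected_kernels(4))
```

With the buggy version described above, a 4-slice check would report three missing files; with the fixed version, none.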
2) smooth
In seisflows, you use the function
so if we want to use nproc>1, I revise it like this:
After these two revisions, the code can run in parallel for each task.
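A guess at the kind of revision meant under 2): make the smoothing call honor nproc by adding the MPI launcher when nproc > 1. The function name, executable path, and argument layout are assumptions modeled on the thread, not the actual seisflows or SPECFEM code.

```python
def smooth_command(nproc, exe="bin/xsmooth_sem",
                   args="2000 2000 vs_kernel INPUT/ OUTPUT/"):
    """Build the smoothing invocation; argument values (smoothing
    lengths, kernel name, directories) are illustrative."""
    launcher = "" if nproc == 1 else f"mpirun -np {nproc} "
    return f"{launcher}{exe} {args}"

print(smooth_command(1))
print(smooth_command(4))
```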
"Hi Chao, MPI parallelization is working fine the 3D case so I'm not sure what's wrong in the 2D case. Perhaps check the xcombine_sem for bugs (SPECFEM2D has never been a funded project so unfortunately there are bugs ). Also, check the xcombine_sem is being invoked with the proper mpiexec wrapper by overloading system.mpiexec if necessary."
However, after the smoothing I find something strange.
After (mpirun -n nproc ./xcombine_sem), I get the kernel like this:
After (smooth), I get the kernel like this:
It seems the smoothing has obvious distortion at the interfaces between processor meshes. Does this mean the smooth method is incorrect? Can you give some suggestions?
Hi Chao,
Thank you for identifying these issues, which arise when the 2D kernel summation and smoothing routines are used with MPI models. I'm unable to address these issues now myself because my PhD defense is within a few weeks. If you wanted to look into it yourself, it would be a matter of debugging and fixing SPECFEM2D's xcombine_sem and xsmooth_sem utilities.
If you want to, feel free to open a new issue either here or on the SPECFEM2D issues page, something along the lines of "SPECFEM2D's xcombine_sem and xsmooth_sem not working for MPI models".
Thanks,
Ryan
Hi Ryan,
Thanks for your suggestion. If xcombine_sem and xsmooth_sem can be fixed, I think we can use MPI. BTW, in your GJI paper I see that you ran some 2D synthetic data, so you used serial rather than parallel for each task, right? Because I think you would also have met this problem if using (mpirun -np nproc xcombine_sem and xsmooth_sem).
Wish you a good time at your PhD defense.
As I was trying to explain in the September 1 post, I'm a little confused about your ultimate goals. If you're running on a cluster, 2D inversions are quite fast even with one core per solver instance; unless your 2D model is huge, there is no need to parallelize over model regions. On the other hand, if you're running on a desktop or laptop, the number of cores is the limiting factor, so you'll likely see no speedup by parallelizing over model regions.
To answer your question anyway, though: I ran those 2D experiments on a cluster, so I used the slurm_sm option for "small" SLURM inversions.
Hi Ryan
OK, let me specify: I am running on a desktop (which has 16 processors), and I use system='multithreaded' in parameters.py. In this case, because nproc=1, it invokes the serial solver (./xmeshfem2d and ./xspecfem2d), and I find it very slow, especially when the source frequency is high.
So I think we could set nproc>1, so that it invokes the parallel solver (mpirun -n nproc ./xmeshfem2d and mpirun -n nproc ./xspecfem2d) for each task and runs faster. Does this make sense?
If you're doing an inversion with only 16 cores, using more cores per solver instance doesn't get you much because it limits the number of shots you can run simultaneously. That's all I was trying to say.