Comments (33)

rmodrak commented on June 27, 2024

Hi Chao,

'Multithreaded' is intended for small-scale applications in which each solver instance runs on a single core. Currently it is not possible to use it with solver executables that require more than one core.

That said, it might be possible to add this functionality.

Could you describe what system/cluster you are running on? What workflow are you carrying out (inversion, migration, ...)?

Ryan

rmodrak commented on June 27, 2024

UPDATE: Actually, I believe there is a way to add this functionality by modifying only a single line. Since it's such a simple change, I'll go ahead and submit a pull request.

rmodrak commented on June 27, 2024

UPDATE: I think it should be ready to go now.

dkzhangchao commented on June 27, 2024

Hi Ryan,

I also think we should be able to use NPROC>1; in that case we can use mpiexec under the 'multithreaded' system.
However, I tested your latest version today and found that there is a bug when running the 2D checkers example. The error occurs when the 'combine' module is used to sum the individual kernels in base.py (solver):

[screenshots: error tracebacks]

Maybe that comes from how xcombine_sem is used; I'm not sure. Can you check that?

[screenshot: error traceback]

dkzhangchao commented on June 27, 2024

Attached is the parameters.py:
parameters.txt

rmodrak commented on June 27, 2024

Hi Chao, from the traceback it looks like an issue with the utility for smoothing kernels. (To double-check, you could try running with SMOOTH=False.)

As a workaround, I would suggest commenting out the 'solver.specfem2d.smooth' method completely, so that the parent-class method 'solver.base.smooth', which uses the SPECFEM xcombine_sem utility, will be invoked instead. Does that make sense?
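One way this workaround could be applied without editing the installed package is a custom solver class that skips the 2D-specific override; this is only a rough sketch, and the module paths and class names (seisflows.solver.specfem2d, seisflows.solver.base) are assumptions based on this thread, not verified against the seisflows source:

```python
# Hypothetical sketch: bypass the 2D-specific smooth() so the parent-class
# routine is used instead. Module paths and class names are assumptions.
from seisflows.solver.base import base
from seisflows.solver.specfem2d import specfem2d


class specfem2d_basesmooth(specfem2d):
    """Custom solver that falls back to the generic smoothing routine."""

    def smooth(self, *args, **kwargs):
        # Skip specfem2d.smooth entirely and call base.smooth, which
        # invokes the SPECFEM utility directly.
        return base.smooth(self, *args, **kwargs)
```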

dkzhangchao commented on June 27, 2024

Hi Ryan, that makes sense. Actually, I already tried setting SMOOTH=False, and the following bug occurs:

[screenshot: error traceback]

rmodrak commented on June 27, 2024

Hi Chao, very useful. I'm not sure what's going on with this traceback, actually; some type of regression? Let me take a look.

rmodrak commented on June 27, 2024

Looks like our cluster here is having issues. I don't think I can debug it immediately, but I will as soon as it is back up.

In the meantime, could you remind me

  1. the size of your model
  2. the number and types of material parameters
  3. what system/cluster you are running on including the number of cores available and the memory per node?

EDIT:

  4. also, how many cores per solver instance were you using for the last traceback?

dkzhangchao commented on June 27, 2024

I just use the 2D checkers example; attached is parameters.py:
parameters.txt

  1. checkerboard model: http://tigress-web.princeton.edu/~rmodrak/2dAcoustic/checkers/
  2. just vs as the material parameter
  3. just my PC, not a cluster

BTW, in bug.log there is a warning, mesh_properties.nproc != PAR.NPROC, because currently mesh_properties.nproc=1 and PAR.NPROC=4. Could this cause the problem?

rmodrak commented on June 27, 2024

Hi Chao, Did you remesh the model? To change the number of processors from 1 to 4, you would need to generate a new numerical mesh via xmeshfem2D.

dkzhangchao commented on June 27, 2024

I also used the model provided in your examples. You mean that if I want to change the number of processors from 1 to 4, I need to remesh the model using xmeshfem2D, i.e. mpiexec -n 4 ./xmeshfem2D, right?

rmodrak commented on June 27, 2024

That's right, you'd need to remesh and supply a new model. You can find information on this, I believe, in the SPECFEM2D manual or on its issues page. Good luck!
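The remeshing step might look roughly like the sketch below, assuming the partition count is controlled by an NPROC (or nproc) entry in the SPECFEM2D Par_file; the paths and the exact parameter name are assumptions, so the SPECFEM2D manual should be checked for the real workflow:

```python
# Hypothetical sketch: raise the partition count in the Par_file and rerun
# the mesher. Paths and the exact parameter name are assumptions.
import re
import subprocess


def set_nproc(par_file, nproc):
    # Rewrite the NPROC/nproc line of the Par_file in place.
    with open(par_file) as f:
        text = f.read()
    text = re.sub(r'(?im)^(nproc\s*=\s*)\d+', r'\g<1>%d' % nproc, text)
    with open(par_file, 'w') as f:
        f.write(text)


set_nproc('DATA/Par_file', 4)
# Regenerate the mesh/database files for 4 partitions; afterwards the
# forward solver would be launched as "mpiexec -n 4 ./xspecfem2D".
subprocess.check_call('mpiexec -n 4 ./xmeshfem2D', shell=True)
```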

rmodrak commented on June 27, 2024

Hi Chao, You need to create a new model in the form SPECFEM2D is able to read and write. Probably it would be good to start by familiarizing yourself with SPECFEM2D. The manual is a good place to start, and the issues page can be useful if you run into any trouble.

rmodrak commented on June 27, 2024

If it's alright I'll go ahead and close soon.

dkzhangchao commented on June 27, 2024

Hi, Ryan
Sorry, I still run into the bugs, even after remeshing the model, like this:

[screenshots: error tracebacks]

Can you try this 2D checkerboard test on your computer? It's weird.
Thanks

rmodrak commented on June 27, 2024

Hi Chao, MPI parallelization is working fine in the 3D case, so I'm not sure what's wrong in the 2D case. Perhaps check xcombine_sem for bugs (SPECFEM2D has never been a funded project, so unfortunately there are bugs). Also, check that xcombine_sem is being invoked with the proper mpiexec wrapper, overloading system.mpiexec if necessary.
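For reference, overloading system.mpiexec might look something like the sketch below; the module path, base-class name, and method signature are assumptions based on this thread rather than the actual seisflows source:

```python
# Hypothetical sketch: override the mpiexec wrapper so SPECFEM executables
# (xspecfem2D, xcombine_sem, ...) are launched under MPI. Module path and
# class name are assumptions, not verified against seisflows.
from seisflows.system.multithreaded import multithreaded


class multithreaded_mpi(multithreaded):

    def mpiexec(self):
        # Prefix each solver call with an MPI launcher; 4 is just an
        # example partition count matching the remeshed model.
        return 'mpiexec -np 4 '
```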

dkzhangchao commented on June 27, 2024

So you also ran into the same problem on your computer, right? By the way, apart from system='multithreaded', if I want to use mpiexec, can I use system='mpi'? In that case, can I set nproc>1?

dkzhangchao commented on June 27, 2024

I mean that all the 2D examples in your guide use nproc=1. Do you have any examples that use nproc>1? What I am wondering is: if it is set to nproc=1, how can mpiexec be used (mpiexec -n nproc ./xmeshfem2d)?

rmodrak commented on June 27, 2024

Hi Chao, It might help to step back a bit first. You're running an inversion and you want each individual solver instance to run on multiple cores. In 3D this is currently working well for us. Such an approach is not currently implemented in 2D, but it should be fairly straightforward if you are familiar with SPECFEM2D and seisflows.

But let me ask, why do you want to do this for 2D? If your 2D model is so large that you can't fit as many copies of it in the memory available on a single node as you have processors available on that node, then it makes sense to have each solver instance run on multiple cores. If not, I can't think of any significant advantage in terms of speed or efficiency.
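To make the trade-off concrete, here is a small back-of-the-envelope calculation; all numbers are made up for illustration and are not from this thread:

```python
# Illustrative arithmetic only; all numbers are hypothetical.
cores_per_node = 16
memory_per_node_gb = 64
memory_per_instance_gb = 12   # assumed footprint of one 2D solver instance

max_instances_by_memory = memory_per_node_gb // memory_per_instance_gb  # 5
max_instances_by_cores = cores_per_node                                 # 16

if max_instances_by_memory < max_instances_by_cores:
    # Memory is the bottleneck: only 5 instances fit, so giving each one
    # 16 // 5 = 3 cores puts the otherwise idle cores to work.
    cores_per_instance = cores_per_node // max_instances_by_memory
else:
    # Cores are the bottleneck: extra cores per instance would only reduce
    # the number of shots running at the same time.
    cores_per_instance = 1

print(cores_per_instance)
```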

dkzhangchao commented on June 27, 2024

Hi Ryan,

Actually, I just want to achieve this: for each source, we can run the solver in parallel (mpiexec -np nproc ./xspecfem).

I always use system='MULTITHREADED', which allows embarrassingly parallel tasks to be carried out several at a time. So for each task we are still running serially rather than in parallel. Actually, in the script serial.py there is a choice:

[screenshot: excerpt from serial.py]

so I figured that you provide the option of running (mpiexec -n nproc ./xspecfem).

At the same time, I checked mpi.py:

[screenshot: excerpt from mpi.py]

So I am puzzled about how parallelism is realized for each task, whether with system='MULTITHREADED' or system='MPI'. In your 2D examples, NPROC is always set to 1; in my mind that means each task runs serially rather than in parallel, right?

rmodrak commented on June 27, 2024

Good question, let me explain the naming convention.

The names of the modules in seisflows/system reflect how parallelization over shots is implemented. For example, system/serial means that shots are carried out one at a time; system/multithreaded means that as many shots are run at a single time as allowed by the available number of processors.

There is no connection here to whether or not individual solver instances run in parallel, only to how parallelization over shots is handled.
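As a purely illustrative sketch (not seisflows code), the multithreaded idea amounts to something like this, with each task itself still serial; the paths and shot count are hypothetical:

```python
# Illustrative only: run one serial solver instance per shot, with at most
# os.cpu_count() shots in flight at any time. Paths are hypothetical.
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

NSHOTS = 25  # e.g. number of sources in the example


def run_shot(itask):
    # Each solver instance is serial (NPROC=1); the only parallelism here
    # comes from running several shots side by side.
    subprocess.check_call(['./bin/xspecfem2D'],
                          cwd='scratch/solver/%06d' % itask)


with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    list(pool.map(run_shot, range(NSHOTS)))
```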

dkzhangchao commented on June 27, 2024

Hi, Ryan
Thanks, that helps my understanding.

  1. So every module in seisflows/system just reflects how parallelization over shots is implemented, right?
  2. For each individual shot, we can run serially or in parallel, and NPROC determines which: if NPROC=1, ./xspecfem2d is invoked; if NPROC>1, mpirun -np NPROC ./xspecfem2d is invoked (see the sketch at the end of this comment). As you know, I failed in the NPROC>1 case under system='MULTITHREADED'.

So, have you tried NPROC>1 (parallel execution of each task) in any of the seisflows/system modules? I have only seen 2D examples with NPROC=1.
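The dispatch described in point 2 could look roughly like this; an illustrative sketch only, not the actual seisflows code:

```python
# Illustrative sketch of the NPROC dispatch described above; not taken
# from seisflows itself.
import subprocess


def launch_solver(nproc, cwd):
    if nproc > 1:
        cmd = 'mpirun -np %d ./bin/xspecfem2D' % nproc
    else:
        cmd = './bin/xspecfem2D'
    subprocess.check_call(cmd, shell=True, cwd=cwd)
```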

rmodrak commented on June 27, 2024
  1. correct

  2. correct

  3. we have used NPROC>1 routinely for 3D but not for 2D

dkzhangchao commented on June 27, 2024

Hi Ryan,

Makes sense, thanks for your help

rmodrak commented on June 27, 2024

Hope it was helpful. Good luck!

dkzhangchao commented on June 27, 2024

Hi Ryan,

I can now use nproc>1 under system='MULTITHREADED'. Actually, I found there are two places that need to be revised to make this work.

1) xcombine_sem
In the examples we use specfem2d-d745c542 (as you said last time, this version has some bugs in xcombine_sem): no matter whether you run ./xcombine_sem or mpirun -n 4 ./xcombine_sem, only one processor kernel is generated, proc000000_vs_kernel.bin (no proc000001_vs_kernel.bin, proc000002_vs_kernel.bin, proc000003_vs_kernel.bin). So I downloaded the latest version of SPECFEM2D and used it instead of the old version when running mpirun -n 4 ./xcombine_sem; after that I get four kernel files and it works.

2) smooth
In seisflows, you use this function:

[screenshot: original smoothing call]

so to use nproc>1, I revised it like this (a rough sketch follows below):

[screenshot: revised smoothing call]

After these two revisions, the code can run each task in parallel.

"Hi Chao, MPI parallelization is working fine the 3D case so I'm not sure what's wrong in the 2D case. Perhaps check the xcombine_sem for bugs (SPECFEM2D has never been a funded project so unfortunately there are bugs ). Also, check the xcombine_sem is being invoked with the proper mpiexec wrapper by overloading system.mpiexec if necessary."

dkzhangchao commented on June 27, 2024

However, after the smoothing I find something strange.
After mpirun -n nproc ./xcombine_sem, I get a kernel like this:

[screenshot: summed kernel before smoothing]

After smoothing, I get a kernel like this:

[screenshot: kernel after smoothing]

It seems the smoothing introduces obvious distortion along the interfaces between the processor meshes. Does this mean the smoothing method is incorrect? Can you give some suggestions?

rmodrak commented on June 27, 2024

Hi Chao,

Thank you for identifying these issues, which arise when the 2D kernel summation and smoothing routines are used with MPI models. I'm unable to address these issues now myself because my PhD defense is within a few weeks. If you wanted to look into it yourself, it would be a matter of debugging and fixing SPECFEM2D's xcombine_sem and xsmooth_sem utilities.

If you want to, feel free to open a new issue either here or on the SPECFEM2D issues page, something along the lines of "SPECFEM2D's xcombine_sem and xsmooth_sem not working for MPI models".

Thanks,
Ryan

dkzhangchao commented on June 27, 2024

Hi Ryan,
Thanks for your suggestion. If xcombine_sem and xsmooth_sem can be fixed, I think we can use MPI. BTW, in your GJI paper I see that you ran some 2D synthetic experiments, so you used serial rather than parallel execution for each task, right? I think you would also have hit this problem if using mpirun -np nproc with xcombine_sem and xsmooth_sem.
Wish you a good PhD defense.

rmodrak commented on June 27, 2024

As I was trying to explain in the September 1 post, I'm a little confused about your ultimate goals. If you're running on a cluster, 2D inversions are quite fast even with one core per solver instance; unless your 2D model is huge, there is no need to parallelize over model regions. On the other hand, if you're running on a desktop or laptop, the number of cores is the limiting factor, so you'll likely see no speedup from parallelizing over model regions.

To answer your question anyway though, I ran those 2D experiments on a cluster, so I used the slurm_sm option for "small" SLURM inversions.

dkzhangchao commented on June 27, 2024

Hi Ryan
OK, let me be specific: I am running on a desktop (which has 16 processors) and I use system='multithreaded' in parameters.py. In this case, because nproc=1, the serial solvers (./xmeshfem2d and ./xspecfem2d) are invoked, and I find the runs very slow, especially when the source frequency is high.

So I think we could set nproc>1, so that the parallel solvers (mpirun -n nproc ./xmeshfem2d and mpirun -n nproc ./xspecfem2d) are invoked for each task and the runs are faster. Does this make sense?

rmodrak commented on June 27, 2024

If you're doing an inversion with only 16 cores, using more cores per solver instance doesn't get you much because it limits the number of shots you can run simultaneously. That's all I was trying to say.
