
5enxia / parallel-krylov


Krylov subspace method modules made by the Ikuno laboratory at Tokyo University of Technology

License: MIT License

Python 100.00%
krylov-subspace-methods cuda mpi numpy cupy

parallel-krylov's Introduction

krylov

Krylov subspace methods module.

Methods

  • CG
  • MrR
  • k-skip CG
  • k-skip MrR
  • Adaptive k-skip MrR
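For orientation, the first method in the list can be sketched in a few lines of NumPy. This is a textbook conjugate gradient loop for a symmetric positive-definite matrix, not the repository's implementation; the function name `cg`, its parameters, and the test matrix are all assumptions made for illustration.

```python
import numpy as np


def cg(A, b, tol=1e-10, max_iter=1000):
    """Textbook conjugate gradient for a symmetric positive-definite A."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x              # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for i in range(max_iter):
        # Stop once the relative residual norm is below the tolerance
        if np.sqrt(rs_old) / b_norm < tol:
            return x, i, True
        Ap = A @ p
        alpha = rs_old / (p @ Ap)          # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p      # next conjugate direction
        rs_old = rs_new
    return x, max_iter, bool(np.sqrt(rs_old) / b_norm < tol)


# Small, well-conditioned SPD system (hypothetical data, illustration only)
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
b = rng.standard_normal(50)
x, iterations, converged = cg(A, b)
print(converged, iterations, np.linalg.norm(b - A @ x))
```

Each iteration needs one matrix-vector product and two inner products. Broadly speaking, k-skip variants aim to reduce how often those inner products have to be synchronized across processes.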

directories

  • v3
    • cpu
      • mpi
        • cg
        • mrr
        • kskipcg
        • kskipmrr
        • adaptivekskipmrr
      • cg
      • mrr
      • kskipcg
      • kskipmrr
      • adaptivekskipmrr
    • gpu
      • mpi
        • cg
        • mrr
        • kskipcg
        • kskipmrr
        • adaptivekskipmrr
      • cg
      • mrr
      • kskipcg
      • kskipmrr
      • adaptivekskipmrr

requirements

C libs

only CPU

  • C Compiler
    • GCC
    • Intel C Compiler
    • etc...
  • BLAS library

with GPU

  • CUDA(10.1)

with MPI

Python3 modules

only CPU

with GPU

with MPI

only when running with CUDA and mpiexec.hydra (Intel MPI)

  • fastrlock

settings

on macOS(10.14.6)

export the following variable

export PMIX_MCA_gds=hash

with cupy

expand memory allocator limit

import cupy as cp
# Use CUDA managed (unified) memory for CuPy's memory pool
pool = cp.cuda.MemoryPool(cp.cuda.malloc_managed)
cp.cuda.set_allocator(pool.malloc)


parallel-krylov's Issues

k-skip MrR with GPU + MPI does not converge

Fri Dec 24 17:12:03 JST 2021
# ================ INFO ================ #
Method: k-skip MrR + gpu + mpi
Initial_k: 0
Time: 0.5320529937744141 s
Status: diverged
Iteration: 112 times
Final_Residual: 0.2209790717824514
# ====================================== #

InfiniBand cache error

[scb0115:mpi_rank_0][MPIDI_CH3_Init]

Please set LD_PRELOAD to the full path of libmpi.so of your MVAPICH2-GDR installation to avoid unexpected errors for GPU-based transfers

E.g. LD_PRELOAD=<PATH_TO_MVAPICH2_GDR_INSTALL>/lib64/libmpi.so

WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing without InfiniBand registration cache support.

The output of the MPI adaptive k-skip MrR method does not match that of the single-process version.

single

============== INFO =================

Method: adaptive k-skip MrR
k: 5
time: 0.013791053999739233 s
status: converged
iteration: 6 times
residual: 3.4114063610239287e-09
final k: 5

=====================================

MPI

============== INFO =================

Method: adaptive k-skip MrR
k: 5
time: 0.130841 s
status: converged
iteration: 7 times
residual: 2.9166983973371867e-10
final k: 5

=====================================
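One thing worth ruling out when comparing the two runs (an assumption, not a confirmed diagnosis of this issue) is floating-point summation order. An MPI reduction combines partial sums from each rank, and floating-point addition is not associative, so inner products can differ in the last bits between the single and MPI versions. The values below are hypothetical and only illustrate the effect.

```python
# Three terms of a hypothetical inner product
a, b, c = 0.1, 0.2, 0.3

serial = (a + b) + c   # left-to-right accumulation, as in a single process
split = a + (b + c)    # a partial sum formed first, as in a reduction across ranks

print(serial == split)
print(abs(serial - split))
```

Differences of this size are normally harmless, but close to a convergence threshold they can shift the iteration count by one. A larger discrepancy would point to a real bug instead.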

Calculation results are incorrect when running asynchronously with memcpyPeerAsync()

def dot(cls, local_A, x, out):
        # Copy vector data to all devices
        for i in range(cls.begin, cls.end+1):
            index = i-cls.begin
            cp.cuda.runtime.memcpyPeerAsync(cls.x[index].data.ptr, i, x.data.ptr, cls.end, cls.nbytes, cls.streams[index].ptr)
        # dot
        for i in range(cls.begin, cls.end+1):
            index = i-cls.begin
            Device(i).use()
            cls.streams[index].synchronize()
            cls.y[index] = cls.A[index].dot(cls.x[index])
        # Gather calculated elements from all devices
        for i in range(cls.begin, cls.end+1):
            Device(i).synchronize()
            index = i-cls.begin
            cp.cuda.runtime.memcpyPeerAsync(cls.out[index*cls.local_local_N].data.ptr, cls.end, cls.y[index].data.ptr, i, cls.local_local_nbytes, cls.streams[index].ptr)
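When chasing a problem like this, a synchronous CPU reference for the same three phases (broadcast the vector, multiply each row block, gather the pieces) gives something to compare the asynchronous result against. The sketch below uses NumPy only; `blocked_matvec` and `num_blocks` are hypothetical names, and each loop iteration merely stands in for one device.

```python
import numpy as np


def blocked_matvec(A, x, num_blocks):
    """Row-block matrix-vector product, computed one block at a time."""
    n = A.shape[0]
    out = np.empty(n, dtype=np.result_type(A, x))
    # Split the rows into num_blocks contiguous, nearly equal ranges
    bounds = np.linspace(0, n, num_blocks + 1).astype(int)
    for begin, end in zip(bounds[:-1], bounds[1:]):
        local_x = x.copy()                        # phase 1: each "device" gets its own copy of x
        local_y = A[begin:end] @ local_x          # phase 2: local product on this row block
        out[begin:end] = local_y                  # phase 3: gather into the output vector
    return out


# Hypothetical test data, for illustration only
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
x = rng.standard_normal(8)
y = blocked_matvec(A, x, num_blocks=4)
print(np.allclose(y, A @ x))
```

If the GPU version matches this reference only after adding explicit synchronization following the final gather, the likely cause is that the output is being read before the asynchronous copies have completed.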
