hpac / lammps-tersoff-vector

A Vectorized Implementation of the Tersoff Potential 
    for the LAMMPS Molecular Dynamics Software
====================================================

Author: Markus Höhnerbach
Date:  4 Aug 2016

This project provides the source code of a vectorized
implementation of the Tersoff potential.
We target a variety of processors with conventional vector
instruction sets such as NEON, SSE, AVX, and AVX2, the first 
and second generation of the Xeon Phi accelerator, as well 
as NVIDIA GPUs.
There is experimental support for platform-agnostic 
vectorization through the Cilk array notation.

Supported compilers: ICC 14.0, 15.0 or 16.0, GCC (ARM)
Supported MPI: Intel MPI

The code builds upon the existing Xeon Phi support and
vectorization capabilities of the USER-INTEL LAMMPS
package as well as the GPU support from the KOKKOS package.

Overview
--------

benchmarks/
  vect/
    very simple benchmark to measure vectorization efficiency.
  lammps/
    input files, parameter files and scripts to conduct
    benchmarking and accuracy tests. Subfolders contain
    results from real-world systems.
machines/
  lammps-10Mar16/
    complete LAMMPS source code that is known to work
    with the provided source code.
  <a>-<b>_<c>/
    folder to build LAMMPS on a specific system. Names:
    a = organization, b = CPU arch, c = accelerator.
    These folders contain a build.sh script that shows
    how to build binaries to experiment with on a given
    system.
src/
  The core source code that contains the vectorized
  Tersoff potential. Can be dropped into an existing
  LAMMPS install with USER-INTEL package installed,
  and should just work.
test/
  Contains a script to test the code against both the
  benchmark and randomly generated systems of multiple
  species. Invoke the Python script with the binary
  that you would like to test. For now, it only works with
  the USER-INTEL package.
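
The naming scheme for the build folders under machines/ can be
illustrated with a small Python sketch. The function name
parse_machine_dir and the MachineDir type are hypothetical and not
part of this repository. The sketch assumes that the organization
contains no "-" and the CPU arch contains no "_"; names that do not
follow the <a>-<b>_<c> pattern (such as lammps-10Mar16) are rejected.

```python
from typing import NamedTuple


class MachineDir(NamedTuple):
    """Parts of a build folder name of the form <a>-<b>_<c> (hypothetical type)."""
    organization: str
    cpu_arch: str
    accelerator: str


def parse_machine_dir(name):
    """Split '<a>-<b>_<c>' into organization, CPU arch and accelerator."""
    # Everything before the first "-" is taken as the organization.
    organization, dash, rest = name.partition("-")
    # The remainder is split at the first "_" into CPU arch and accelerator.
    cpu_arch, underscore, accelerator = rest.partition("_")
    if not (dash and underscore and organization and cpu_arch and accelerator):
        raise ValueError(f"{name!r} does not match <a>-<b>_<c>")
    return MachineDir(organization, cpu_arch, accelerator)


if __name__ == "__main__":
    print(parse_machine_dir("lrz-ib_phi"))
```

Applied to machines/lrz-ib_phi, which is mentioned further below,
this would separate the name into "lrz", "ib" and "phi".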

Installation (simple)
---------------------

To try this code out, download LAMMPS from lammps.sandia.gov,
and extract the files to some directory $LAMMPS_DIR.
In the following, $THIS denotes the directory where this
README is located.
You need to enable the packages MANYBODY, USER-OMP and USER-INTEL:

$ cd $LAMMPS_DIR/src
$ make yes-MANYBODY yes-USER-OMP yes-USER-INTEL

Copy the files pair_tersoff_intel.h, pair_tersoff_intel.cpp
and intel_intrinsics.h from $THIS/src/ to $LAMMPS_DIR/src.
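
If you prefer to script the copy step, a minimal Python sketch could
look as follows. The function name copy_sources is hypothetical; the
three file names and the src/ sub-directories are taken from the
instructions above. The sketch assumes that both directories already
exist and refuses to copy anything if one of the files is missing.

```python
import shutil
from pathlib import Path

# The three files named in the installation instructions.
FILES = ("pair_tersoff_intel.h", "pair_tersoff_intel.cpp", "intel_intrinsics.h")


def copy_sources(this_dir, lammps_dir):
    """Copy the Tersoff sources from this_dir/src to lammps_dir/src."""
    src = Path(this_dir) / "src"
    dst = Path(lammps_dir) / "src"
    # Check for all files first so that a partial copy never happens.
    missing = [name for name in FILES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")
    # copy2 also preserves timestamps; it returns the destination path.
    return [Path(shutil.copy2(src / name, dst / name)) for name in FILES]
```

It would be called as copy_sources(THIS, LAMMPS_DIR) with the two
directories described above.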

Build LAMMPS (make sure to have ICC with offloading support
and Intel MPI loaded):

$ make intel_phi

This creates a binary $LAMMPS_DIR/src/lmp_intel_phi.

Testing (simple)
----------------

To test this binary, use the provided test script:

$ cd $THIS/test
$ python test.py $LAMMPS_DIR/src/lmp_intel_phi

All the tests should turn green.

Usage
-----

For further usage instructions, please have a look at
the documentation of the USER-INTEL package.
The code plugs directly into that framework. All you need
to do is:
1. Specify the correct "package intel" command according
   to the USER-INTEL docs, to initialize the correct usage
   mode.
2. Use the Tersoff potential and set the suffix to "intel".

Getting Started
---------------

If you just want to try out the code and make some
observations on its performance, the easiest way to do so
is to download the LAMMPS-provided benchmark for the Tersoff
potential, and pass the correct options via the command line.

$ wget http://lammps.sandia.gov/bench/bench_tersoff.tar.gz
$ tar xfz bench_tersoff.tar.gz
$ cd tersoff
$ $LAMMPS_DIR/src/lmp_intel_phi -in in.tersoff -pk omp 0 \
-pk intel 1 balance $BALANCE mode $MODE -sf intel

1. Choose $MODE as either single, double or mixed depending 
   on the precision you want the run to use.
2. Choose $BALANCE according to where you want to run:
   0 runs everything on the host, 1 everything on the Phi,
   values in between split the computation. -1 will perform
   automatic load balancing.

In-Depth Benchmarking
---------------------

For in-depth benchmarking, build all the binaries that you
would like to investigate (machines/*/build.sh show how to
build a variety of targets).
For single-node benchmarking, benchmarks/lammps contains
shell scripts to conduct a number of experiments.
For multi-node benchmarking, machines/lrz-ib_phi contains
a Python script to showcase how to create job scripts to
be submitted to a batch system.
If you can't run the code on suitable machines, check out
the result folders, i.e. benchmarks/lammps/results* and
machines/lrz-ib_phi/run*, as they contain real-world data
from a selection of machines.

Limitations
-----------

The code inherits all the limitations of the USER-INTEL
package or the KOKKOS package, depending on which one is used.
Please consult the respective documentation for details.

Reference
---------

There is a preprint describing this work on arXiv.org:
https://arxiv.org/abs/1607.02904

License
-------

The code is licensed in accordance with the LAMMPS copyright 
under the GNU General Public License Version 2 onwards.
The vector math functions in vector_math_neon.h are copyrighted
by Julien Pommier under the zlib license.

Issues
------

Instructions and Documentation for Power8 and XL

In commit 13, support for Power8 and the XL compiler was added. However, no accompanying instructions were added on how to build, execute and experiment on Power8.

For example, it's unclear whether intel_intrinsics_power8.h must be copied to $LAMMPS_DIR/src. Additionally, to build LAMMPS I ran make power8, which results in numerous errors. In particular, the changes in commit 13 are not propagated: the changes to IntelBuffers are not included, nor is the change to pair_tersoff_intel.cpp.

After manually applying the changes from commit 13 to $LAMMPS_DIR/src/ and copying all of the new power8 files into $LAMMPS_DIR/src, I obtain the following errors:

rs6000_secondary_reload_inner:17011, type = load
(parallel [
        (set (reg:V4SI 82 5)
            (reg:V4SI 34 2))
        (clobber (reg:DI 4 4))
    ])
../pair_tersoff_intel.cpp: In static member function ‘static void IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::kernel_step(int, int, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::iarr, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::iarr, const int*, int, typename LAMMPS_NS::IntelBuffers<flt_t, acc_t>::atom_t*, const typename LAMMPS_NS::PairTersoffIntel::ForceConst<flt_t>::c_inner_t*, const typename LAMMPS_NS::PairTersoffIntel::ForceConst<flt_t>::c_outer_t*, typename LAMMPS_NS::IntelBuffers<flt_t, acc_t>::vec3_acc_t*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec*, int, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::iarr, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::iarr, IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::bvec) [with int EVFLAG = 1; int EFLAG = 1; flt_t = float; acc_t = float; lmp_intel::CalculationMode mic = (lmp_intel::CalculationMode)7u; bool pack_i = false; IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::iarr = int [4]; typename LAMMPS_NS::IntelBuffers<flt_t, acc_t>::atom_t = LAMMPS_NS::IntelBuffers<float, float>::atom_t; typename LAMMPS_NS::PairTersoffIntel::ForceConst<flt_t>::c_inner_t = LAMMPS_NS::PairTersoffIntel::ForceConst<float>::c_inner_t; typename LAMMPS_NS::PairTersoffIntel::ForceConst<flt_t>::c_outer_t = LAMMPS_NS::PairTersoffIntel::ForceConst<float>::c_outer_t; typename LAMMPS_NS::IntelBuffers<flt_t, acc_t>::vec3_acc_t = LAMMPS_NS::IntelBuffers<float, float>::vec3_acc_t; IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::avec = lmp_intel::altivec::fvec; IntelKernelTersoff<flt_t, acc_t, mic, pack_i>::bvec = lmp_intel::altivec::bvec]’:
../pair_tersoff_intel.cpp:886:1: internal compiler error: in rs6000_secondary_reload_fail, at config/rs6000/rs6000.c:16984
 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/cc15wmyP.out file, please attach this to your bugreport.
make[1]: *** [pair_tersoff_intel.o] Error 1
make[1]: Leaving directory `/home/petros/tersoff/lammps-tersoff-vector/machines/lammps-10Mar16/src/Obj_power8'
make: *** [power8] Error 2

Clearly, I am making a mistake in some portion of the build process. It is unclear, though, which portion.

You can obtain a repro of the issue in our fork of the repo.

error report: Compiling with ICC 14.0

I tried to compile LAMMPS on our group cluster, which has four nodes, each with two coprocessors. Following the instructions, I received the following error message:
In file included from ../pair_tersoff_intel.cpp(61):
../intel_intrinsics.h(262): error: MIC identifier "_mm512_sincos_pd" is undefined
return _mm512_sincos_pd(reinterpret_cast<__m512d *>(cos), a);
^

In file included from ../pair_tersoff_intel.cpp(61):
../intel_intrinsics.h(395): error: MIC identifier "_mm512_sincos_ps" is undefined
return _mm512_sincos_ps(reinterpret_cast<__m512 *>(cos), a);
^

compilation aborted for ../pair_tersoff_intel.cpp (code 2)
make[1]: *** [pair_tersoff_intel.o] Error 2
make[1]: Leaving directory `/home/cormackgroup/Shared/wly/LAMMPS_wly/lammps-16Feb16/src/Obj_intel_phi'
make: *** [intel_phi] Error 2

Any suggestions or help would be highly appreciated.
