Compilation flags in arch.mk for GPU offloading
BerkeleyGW supports GPU offloading on hardware from three vendors: NVIDIA, AMD, and Intel.
Below, we describe the flags needed in arch.mk to compile BerkeleyGW with GPU offload enabled. See also the examples in the config/ directory, in particular:
- perlmutter.nvhpc.gpu.nersc.gov.mk: NVIDIA GPUs
- frontier.cray.gpu.ornl.gov.mk: AMD GPUs
- aurora.intel.gpu.alcf.gov.mk: Intel GPUs
In general, you need to specify three things: a compiler, a GPU library API, and an offloading programming model (OpenACC and/or OpenMP-target).
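As a quick sanity check of these requirements, the sketch below scans an arch.mk for the preprocessor flags described in the sections that follow. The function name `check_gpu_flags` is hypothetical and not part of BerkeleyGW. It assumes that a build targets exactly one GPU vendor and that commented-out lines start with `#`.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with BerkeleyGW): sanity-check GPU flags in an arch.mk.
check_gpu_flags() {
    archmk="$1"
    [ -r "$archmk" ] || { echo "cannot read $archmk"; return 2; }

    # Ignore comment lines so that commented-out flags are not counted.
    flags=$(grep -v '^[[:space:]]*#' "$archmk")

    # Assumption: exactly one vendor GPU macro should be present.
    vendors=0
    for v in NVIDIA_GPU AMD_GPU INTEL_GPU; do
        if printf '%s\n' "$flags" | grep -qF -- "-D$v"; then
            vendors=$((vendors + 1))
        fi
    done
    if [ "$vendors" -ne 1 ]; then
        echo "expected exactly one of -DNVIDIA_GPU, -DAMD_GPU, -DINTEL_GPU"
        return 1
    fi

    # At least one offloading programming model must be selected.
    has_acc=no
    has_omp=no
    if printf '%s\n' "$flags" | grep -qF -- "-DOPENACC"; then has_acc=yes; fi
    if printf '%s\n' "$flags" | grep -qF -- "-DOMP_TARGET"; then has_omp=yes; fi
    if [ "$has_acc" = no ] && [ "$has_omp" = no ]; then
        echo "no offload model: add -DOPENACC and/or -DOMP_TARGET"
        return 1
    fi

    # The Intel section states that ifx supports only OpenMP-target.
    if [ "$has_acc" = yes ] && printf '%s\n' "$flags" | grep -qF -- "-DINTEL_GPU"; then
        echo "-DOPENACC is not supported together with -DINTEL_GPU"
        return 1
    fi

    echo "ok"
    return 0
}
```

After sourcing the snippet, `check_gpu_flags arch.mk` prints `ok` when the file passes these checks and a short diagnostic otherwise. It does not verify compiler or linker options.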
Compilation on NVIDIA Architectures
The preferred compiler on NVIDIA architectures is the nvhpc compiler suite (usually provided by the PrgEnv-nvhpc module).
The preferred library API is NVHPC_API (based on CUDA), which is usually provided automatically in the link and include paths by the NVHPC-SDK distribution.
To use the above combination set:
COMPFLAG = -DNVHPC -DNVHPC_API -DNVIDIA_GPU
The nvhpc compiler supports both the OpenACC and OpenMP-target programming models. You can compile with just one of them or with both; in the latter case the default programming model (ALGO) is OpenACC, and OpenMP-target can be turned on by setting the appropriate flags in the input. Make sure to include the -DOPENACC and/or -DOMP_TARGET flags in the list after PARAFLAG = (or MATHFLAG =).
Typical Fortran compiler flags are:
F90free = ftn -Mfree -acc -mp=multicore,gpu -gpu=cc80 -cudalib=cublas,cufft -traceback -Minfo=all,mp,acc -gopt
LINK = ftn -acc -mp=multicore,gpu -gpu=cc80 -cudalib=cublas,cufft
FOPTS = -fast -Mfree -Mlarge_arrays
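The `-gpu=cc80` option above selects compute capability 8.0 (for example, the A100). On other NVIDIA GPUs the value needs to change accordingly. The sketch below converts a compute capability string into the matching option; the function name `cc_to_gpu_flag` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "8.0" into "-gpu=cc80".
cc_to_gpu_flag() {
    # Drop the dot from the capability string and prepend the nvhpc option prefix.
    printf '%s\n' "-gpu=cc$(printf '%s' "$1" | tr -d '.')"
}
```

For instance, `cc_to_gpu_flag 8.0` yields `-gpu=cc80`. On a GPU node the capability can usually be queried with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, assuming the installed driver is recent enough to support that field.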
Compilation on AMD Architectures
The preferred compiler on AMD architectures is the Cray compiler suite (usually provided by the PrgEnv-cray module).
The preferred library API is HIP_API (based on HIP).
To use the above combination set:
COMPFLAG = -DCRAY -DHIP_API -DAMD_GPU
If the HIP_API library (hipfort) is not provided automatically in the link and include paths by the Cray or AMD distribution, you can build it yourself by following these steps:
- Clone the repository: git@github.com:ROCmSoftwarePlatform/hipfort.git
- Create a build directory and access it: mkdir build ; cd build
- Use CMake to configure the library: cmake ../ -DHIPFORT_COMPILER='ftn' -DHIPFORT_COMPILER_FLAGS='-f free -fopenmp -g -eF ' -DHIPFORT_INSTALL_DIR=<Path_to_lib>
- Build using: make -j 8
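The steps above can be collected into a small shell function, sketched below. The names `run` and `build_hipfort` and the `DRY_RUN` variable are hypothetical. The install prefix argument stands in for `<Path_to_lib>`. Changing into the cloned `hipfort` directory before creating `build` is an assumption: the steps do not state it, but `cmake ../` implies it.

```shell
#!/bin/sh
# Sketch of the hipfort build steps above; not an official BerkeleyGW script.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_hipfort <install_prefix>
# <install_prefix> plays the role of <Path_to_lib> in the text.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_hipfort() (
    prefix="$1"
    [ -n "$prefix" ] || { echo "usage: build_hipfort <install_prefix>" >&2; exit 2; }

    run git clone git@github.com:ROCmSoftwarePlatform/hipfort.git &&
    run cd hipfort &&   # assumption: the build directory lives inside the clone
    run mkdir build &&
    run cd build &&
    run cmake ../ \
        -DHIPFORT_COMPILER='ftn' \
        -DHIPFORT_COMPILER_FLAGS='-f free -fopenmp -g -eF ' \
        -DHIPFORT_INSTALL_DIR="$prefix" &&
    run make -j 8   # only the build step is listed in the text
)
```

Running `DRY_RUN=1 build_hipfort "$HOME/opt/hipfort"` prints the commands without executing them, which is a convenient way to review them before a real build. The example prefix is arbitrary.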
Make sure to include the path to the library in the arch.mk file (${ROCM_PATH} is the path to the ROCm installation, which should be set automatically by the rocm module):
HIP_INC = -J/<Path_to_lib>/include/hipfort/amdgcn/ -I${ROCM_PATH}/include/
HIP_LIB = /<Path_to_lib>/lib/libhipfort-amdgcn.a -L${ROCM_PATH}/lib -lamdhip64 -lhipfft -lhipblas
The Cray compiler supports both the OpenACC and OpenMP-target programming models. You can compile with just one of them or with both; in the latter case the default programming model (ALGO) is OpenACC, and OpenMP-target can be turned on by setting the appropriate flags in the input. Make sure to include the -DOPENACC and/or -DOMP_TARGET flags in the list after PARAFLAG = (or MATHFLAG =).
Typical Fortran compiler flags are:
F90free = ftn -f free -h acc -homp -g -ef -hacc_model=auto_async_none:no_fast_addr:no_deep_copy ${HIP_INC} ${HIP_LIB}
LINK = ftn -f free -h acc -homp -g -ef -hacc_model=auto_async_none:no_fast_addr:no_deep_copy ${HIP_INC} ${HIP_LIB}
FOPTS = -O1
Compilation on Intel Architectures
The preferred compiler on Intel architectures is the ifx compiler from the Intel oneapi suite (usually provided by the oneapi/release/YYY.MM.DD.vvvv module). The preferred library API is ONE_API (which incorporates MKL); it is usually provided automatically in the link and include paths by the oneapi/release distribution.
To use the above combination set:
COMPFLAG = -DINTEL -DINTEL_GPU -DONE_API
The ifx compiler supports only the OpenMP-target programming model. Make sure to include the -DOMP_TARGET flag in the list after PARAFLAG = (or MATHFLAG =).
Typical Fortran compiler flags are:
F90free = mpif90 -fc=ifx -free
LINK = mpif90 -fc=ifx -free
FOPTS = -O2 -g -traceback -check shape -fp-model precise -no-ipo -align array64byte -fiopenmp -fopenmp-targets=spir64 -qmkl=sequential -lmkl_sycl -lsycl -lOpenCL
In addition, append the following to the library path:
-qmkl=sequential -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -lm -ldl