CUDA 8.0 with DualSPHysics 4.0

NWRichmond · December 2016

According to the v4.0 guide, DualSPHysics has been tested with cuda versions up to 7.5. Has anyone had luck with running a compiled version of DualSPHysics using cuda 8.0?

The executable compiled with cuda 7.5 that ships out with the DualSPHysics 4.0 package runs fine on my Ubuntu 16.04 machine with cuda 8.0. Running the code when it's compiled from source with cuda 8.0 is another story. After adjusting the makefile (change line 25 to DIRTOOLKIT=/usr/local/cuda-8.0), I can get the source code to compile without errors, but when I run the code, it is unable to access the cuda runtime libraries. I suspect it's not an issue of the path being set incorrectly because other programs are able to access these libraries with the path set as-is (the path is set as per NVIDIA's recommendations).

Before putting more energy into this, I thought I'd ask the community if anyone's had luck with running DualSPHysics 4.0 compiled from source with cuda 8.0, or if there are fundamental incompatibilities that I'm unaware of.

mashy · December 2016

I managed to compile and run using CUDA 8.0 on Windows with no issues. Haven't tried with my Linux machines yet (that have CUDA 6.5 and 7.0). I wouldn't imagine there should be any fundamental issues though!
Make sure the include path for CUDA is directed to CUDA 8 if you have multiple versions installed. I plan on updating my Linux workstations over Christmas so could possibly offer more detailed help in the new year if you still have no luck by then.

NWRichmond · December 2016

Thanks for the reply, Mashy! I figured out a fix that should apply to most 64-bit Debian-based systems:

It appears that even though the LD_LIBRARY_PATH is set as per NVIDIA's instructions (this can be verified if the command "$ echo $LD_LIBRARY_PATH" returns "usr/local/cuda-8.0/lib64"), the CUDA 8.0 libraries that are pointed to during compilation may not be accessible during runtime. To fix this, add "usr/local/cuda-8.0/lib64" to /etc/ld.so.conf followed by issuance of the command: "$ sudo ldconfig". After that, the libraries pointed to by the DualSPHysics4_linux64 executable (compiled from source) were accessible at runtime.

Now I'm having issues with the compiled DualSPHysics executable returning a segmentation fault. I'll work on a fix, and if I have no luck, I'll post it as a separate question. If I do find a fix, I'll share it here.

Sergey · July 2017

slurmstepd: error: _get_primary_group: getpwnam_r() failed: Numerical result out of range
slurmstepd: error: _initgroups: _get_primary_group() failed
slurmstepd: error: _initgroups: Numerical result out of range
slurmstepd: error: _get_primary_group: getpwnam_r() failed: Numerical result out of range
slurmstepd: error: _initgroups: _get_primary_group() failed
slurmstepd: error: _initgroups: Numerical result out of range
/usr/local/cuda/lib64/::/usr/local/cuda-8.0/lib64/:/usr/lib/nvidia-375/
tesla3

Copyright (C) 2016 by
Dr Jose M. Dominguez, Dr Alejandro Crespo,
Prof. Moncho Gomez Gesteira, Dr Anxo Barreiro,
Dr Benedict Rogers, Dr Georgios Fourtakas, Dr Athanasios Mokos,
Dr Renato Vacondio, Dr Ricardo Canelas,
Dr Stephen Longshaw, Dr Corrado Altomare.

EPHYSLAB Environmental Physics Laboratory, Universidade de Vigo
School of Mechanical, Aerospace and Civil Engineering, University of Manchester

DualSPHysics is free software: you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.

DualSPHysics is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License,
along with DualSPHysics. If not, see .

DualSPHysics4 v4.0.055 (15-04-2016)
====================================
[Select CUDA Device]
Device 0: "Tesla K40m"
Compute capability: 3.5
Multiprocessors: 15 (2880 cores)
Memory global: 11439 MB
Clock rate: 0.75 GHz
Run time limit on kernels: No
ECC support enabled: Yes

[GPU Hardware]
Device default: 0 "Tesla K40m"
Compute capability: 3.5
Memory global: 11439 MB
Memory shared: 49152 Bytes

[Initialising JSphGpuSingle v0.70 09-07-2017 16:36:38]
**Basic case configuration is loaded
**Special case configuration is loaded
Loading initial state of particles...
Loaded particles: 171496
MapRealPos(border)=(0.000985278,0.000985278,0.000985278)-(1.59901,0.672515,0.451515)
MapRealPos(final)=(0.000985278,0.000985278,0.000985278)-(1.59901,0.672515,0.902044)
**Initial state of particles is loaded
**3D-Simulation parameters:
CaseName="Dambreak"
RunName="Dambreak"
PosDouble="1: Uses double and stores in single precision"
SvTimers=True
StepAlgorithm="Verlet"
VerletSteps=40
Kernel="Cubic"
Viscosity="Artificial"
Visco=0.100000
ViscoBoundFactor=1.000000
DeltaSph="None"
Shifting="None"
RigidAlgorithm="None"
FloatingCount=0
CaseNp=171496
CaseNbound=43186
CaseNfixed=43186
CaseNmoving=0
CaseNfloat=0
CaseNfluid=128310
PeriodicActive=0
Dx=0.0085
H=0.014722
CoefficientH=1
CteB=162005.140625
Gamma=7.000000
RhopZero=1000.000000
Cs0=33.6755
CFLnumber=0.200000
DtIni=0.000437186
DtMin=2.18593e-05
DtAllParticles=False
MassFluid=0.000614
MassBound=0.000614
CubicCte.a1=0.318310
CubicCte.aa=6775352.500000
CubicCte.a24=24937.416016
CubicCte.c1=-20326058.000000
CubicCte.c2=-5081514.500000
CubicCte.d1=15244543.000000
CubicCte.od_wdeltap=0.000018
TimeMax=1.5
TimePart=0.01
Gravity=(0.000000,0.000000,-9.810000)
NpMinimum=43186
RhopOut=True
RhopOutMin=700.000000
RhopOutMax=1300.000000
**Requested gpu memory for 171496 particles: 20.9 MB.
CellOrder="XYZ"
CellMode="2H"
Hdiv=1
MapCells=(55,23,31)
DomCells=(55,23,31)
DomCellCode="11_10_11"

BlockSize calculation mode: Empirical calculation.
BsForcesBound=Dynamic (46 regs)
BsForcesFluid=Dynamic (57 regs)

**CellDiv: Requested gpu memory for 180070 particles: 1.4 MB.
**CellDiv: Requested gpu memory for 4992 cells (CellMode=2H): 0.1 MB.
RunMode="Pos-Double, Single-Gpu, HostName:tesla3"
Allocated memory in CPU: 15434640 (14.72 MB)
Allocated memory in GPU: 23489952 (22.40 MB)
Part_0000 171496 particles successfully stored

[Initialising simulation (solxji9b) 09-07-2017 16:36:40]
PART PartTime TotalSteps Steps Time/Sec Finish time
========= ============ ============ ======= ========= ===================
/var/spool/slurmd/job00010/slurm_script: line 27: 6242 Segmentation fault $GpuSphProcessor $dirout/$name $dirout -svres -gpu
Execution aborted

NWRichmond · July 2017

Sergey, I also get a seg fault from the compiled DualSPHysics on 64-bit Linux. If you figure out a solution, please share it.

Sergey · July 2017

ubuntu-16.04, CUDA 8.0

Sergey · July 2017

" If you figure out a solution, please share it." - I did not understand what you wrote

NWRichmond · July 2017

In other words, if you figure out a way to avoid the segmentation fault, please let me know. Good luck!

tverbrug · October 2017

Hello, is there any news on this segementation fault issue? I am also running ubuntu 16.04 and have the same problem. I tried installing 14.04 instead but my graphics card doesn't support CUDA 7.5

PreDead · October 2018

Hi There

First post and I hope you don't mind me reviving this old thread - but it seemed relevant. I am quite interested in DualPhysics and I am trying to get it running on Debian Stretch. I installed CUDA 8 from the standard repositories and tested it with a simple CUDA example. So, all seems to be working. The one caveat being that the Debian package seems to require nvcc to be run with the -ccbin clang-3.8.

The packaged binaries with V 4.2 don't run and, if memory serves me, the error lead me to believe that they were built with a newer version of CUDA. Rebuilding DualPhysics seems to be the best option but it seems that there is an issue with clang and DualPhysics.

My question is has anyone had success with V 4.2 and the stock CUDA 8.0 and vanilla Debian stretch? BTW, my card (Quadro 4000) is a bit dated and Nvidia dropped support after CUDA 8.

Thanks
PreDead

agavino · November 2018

I also want to contribute as I think still there are people who has to face this problem.

DUALSPHYSICS 4.2 - COMPILATION FOR CUDA FERMI ARQUITECTURE (CUDA 8.0, P.EJ. QUADRO 2000) IN WINDOWS 10

- Downloadel development toolkit (SDK) for Windows 10 SP1 (6.1.7601) and install.
- Downloadel development toolkit for CUDA 8.0 and install.
(It will be present at a folder like C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0)
- Create system variable CUDA_PATH_V8_0 containing the string with the path of CUDA 8.0 folder.
- Install Visual Studio 2015.
- Change within the project DualSPHysics4Re.vcxproj the following parameters:
- Windows 10 SDK version:
10.0.17134.0 -> 6.1.7601
- Visual Studio a 2015 version:
Ver141 -> Ver140
- CUDA instalation folder:
$(CUDA_PATH_V9_2) -> $(CUDA_PATH_V8_0)
If needed, also indicate proper subfolders.
- CUDA 8 version for Fermi arquitecture (up to CUDA 8):
Add
compute_20,sm_20;compute_21,sm_21;
before
compute_30,sm_30;compute_35,sm_35;compute_50,sm_50;compute_52,sm_52;compute_61,sm_61;compute_70,sm_70
Change
compute_30,sm_30 -> compute_21,sm_21

agavino · November 2018

I forgot to highlight that the compilation is for DualSPHysics 4.2.
Unfortunately, I couldn't make work DualSPHysics4.0_LiquidGas

PreDead · November 2018

Agavino, thanks for the detailed reply. I was wondering about Debian but I do see some things in your post for me to try. When I get back to the machine I will make another attempt. I will report back!

CUDA 8.0 with DualSPHysics 4.0

Comments