CUDA 8.0 with DualSPHysics 4.0

edited December 2016 in DualSPHysics_v4.0
According to the v4.0 guide, DualSPHysics has been tested with CUDA versions up to 7.5. Has anyone had luck running a compiled version of DualSPHysics with CUDA 8.0?

The executable compiled with CUDA 7.5 that ships with the DualSPHysics 4.0 package runs fine on my Ubuntu 16.04 machine with CUDA 8.0. Running the code compiled from source with CUDA 8.0 is another story. After adjusting the Makefile (changing line 25 to DIRTOOLKIT=/usr/local/cuda-8.0), I can get the source code to compile without errors, but when I run it, it is unable to access the CUDA runtime libraries. I suspect it's not an issue of the path being set incorrectly, because other programs are able to access these libraries with the path set as-is (the path is set as per NVIDIA's recommendations).
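
For reference, the Makefile change and rebuild amount to something like this (line 25 is the only edit):

    # Makefile, line 25: point the build at the CUDA 8.0 toolkit instead of 7.5
    DIRTOOLKIT=/usr/local/cuda-8.0

    $ make clean && make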

Before putting more energy into this, I thought I'd ask the community whether anyone has had luck running DualSPHysics 4.0 compiled from source with CUDA 8.0, or whether there are fundamental incompatibilities I'm unaware of.

Comments

  • I managed to compile and run using CUDA 8.0 on Windows with no issues. I haven't tried it on my Linux machines yet (they have CUDA 6.5 and 7.0), but I wouldn't imagine there are any fundamental issues.
    Make sure the include path for CUDA points to CUDA 8 if you have multiple versions installed (a quick check is sketched below). I plan on updating my Linux workstations over Christmas, so I could offer more detailed help in the new year if you still have no luck by then.
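    On Linux, a quick way to check which toolkit a build will pick up when several versions are installed (the paths below are the usual defaults, so adjust them if yours differ):

    # which nvcc is on the PATH, and which toolkit version it reports
    which nvcc
    nvcc --version
    # /usr/local/cuda is typically a symlink to the default toolkit, e.g. /usr/local/cuda-8.0
    ls -l /usr/local/cuda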
  • Thanks for the reply, Mashy! I figured out a fix that should apply to most 64-bit Debian-based systems:

    It appears that even though LD_LIBRARY_PATH is set as per NVIDIA's instructions (verify this by checking that "$ echo $LD_LIBRARY_PATH" returns "/usr/local/cuda-8.0/lib64"), the CUDA 8.0 libraries pointed to during compilation may not be accessible at runtime. To fix this, add "/usr/local/cuda-8.0/lib64" to /etc/ld.so.conf and then run "$ sudo ldconfig". After doing this, the libraries referenced by the DualSPHysics4_linux64 executable (compiled from source) were accessible at runtime.
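
    In concrete terms (assuming the default CUDA 8.0 install location), the steps were roughly:

    # confirm the linker search path set up per NVIDIA's install guide
    $ echo $LD_LIBRARY_PATH
    /usr/local/cuda-8.0/lib64
    # append the CUDA 8.0 library directory to the loader configuration
    $ echo "/usr/local/cuda-8.0/lib64" | sudo tee -a /etc/ld.so.conf
    # rebuild the shared library cache
    $ sudo ldconfig
    # check that the CUDA runtime is now visible to the dynamic loader
    $ ldconfig -p | grep libcudart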

    Now I'm having issues with the compiled DualSPHysics executable returning a segmentation fault. I'll work on a fix, and if I have no luck, I'll post it as a separate question. If I do find a fix, I'll share it here.
  • slurmstepd: error: _get_primary_group: getpwnam_r() failed: Numerical result out of range
    slurmstepd: error: _initgroups: _get_primary_group() failed
    slurmstepd: error: _initgroups: Numerical result out of range
    slurmstepd: error: _get_primary_group: getpwnam_r() failed: Numerical result out of range
    slurmstepd: error: _initgroups: _get_primary_group() failed
    slurmstepd: error: _initgroups: Numerical result out of range
    /usr/local/cuda/lib64/::/usr/local/cuda-8.0/lib64/:/usr/lib/nvidia-375/
    tesla3


    Copyright (C) 2016 by
    Dr Jose M. Dominguez, Dr Alejandro Crespo,
    Prof. Moncho Gomez Gesteira, Dr Anxo Barreiro,
    Dr Benedict Rogers, Dr Georgios Fourtakas, Dr Athanasios Mokos,
    Dr Renato Vacondio, Dr Ricardo Canelas,
    Dr Stephen Longshaw, Dr Corrado Altomare.

    EPHYSLAB Environmental Physics Laboratory, Universidade de Vigo
    School of Mechanical, Aerospace and Civil Engineering, University of Manchester

    DualSPHysics is free software: you can redistribute it and/or
    modify it under the terms of the GNU General Public License as
    published by the Free Software Foundation, either version 3 of
    the License, or (at your option) any later version.

    DualSPHysics is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License,
    along with DualSPHysics. If not, see <http://www.gnu.org/licenses/>.


    DualSPHysics4 v4.0.055 (15-04-2016)
    ====================================
    [Select CUDA Device]
    Device 0: "Tesla K40m"
    Compute capability: 3.5
    Multiprocessors: 15 (2880 cores)
    Memory global: 11439 MB
    Clock rate: 0.75 GHz
    Run time limit on kernels: No
    ECC support enabled: Yes

    [GPU Hardware]
    Device default: 0 "Tesla K40m"
    Compute capability: 3.5
    Memory global: 11439 MB
    Memory shared: 49152 Bytes

    [Initialising JSphGpuSingle v0.70 09-07-2017 16:36:38]
    **Basic case configuration is loaded
    **Special case configuration is loaded
    Loading initial state of particles...
    Loaded particles: 171496
    MapRealPos(border)=(0.000985278,0.000985278,0.000985278)-(1.59901,0.672515,0.451515)
    MapRealPos(final)=(0.000985278,0.000985278,0.000985278)-(1.59901,0.672515,0.902044)
    **Initial state of particles is loaded
    **3D-Simulation parameters:
    CaseName="Dambreak"
    RunName="Dambreak"
    PosDouble="1: Uses double and stores in single precision"
    SvTimers=True
    StepAlgorithm="Verlet"
    VerletSteps=40
    Kernel="Cubic"
    Viscosity="Artificial"
    Visco=0.100000
    ViscoBoundFactor=1.000000
    DeltaSph="None"
    Shifting="None"
    RigidAlgorithm="None"
    FloatingCount=0
    CaseNp=171496
    CaseNbound=43186
    CaseNfixed=43186
    CaseNmoving=0
    CaseNfloat=0
    CaseNfluid=128310
    PeriodicActive=0
    Dx=0.0085
    H=0.014722
    CoefficientH=1
    CteB=162005.140625
    Gamma=7.000000
    RhopZero=1000.000000
    Cs0=33.6755
    CFLnumber=0.200000
    DtIni=0.000437186
    DtMin=2.18593e-05
    DtAllParticles=False
    MassFluid=0.000614
    MassBound=0.000614
    CubicCte.a1=0.318310
    CubicCte.aa=6775352.500000
    CubicCte.a24=24937.416016
    CubicCte.c1=-20326058.000000
    CubicCte.c2=-5081514.500000
    CubicCte.d1=15244543.000000
    CubicCte.od_wdeltap=0.000018
    TimeMax=1.5
    TimePart=0.01
    Gravity=(0.000000,0.000000,-9.810000)
    NpMinimum=43186
    RhopOut=True
    RhopOutMin=700.000000
    RhopOutMax=1300.000000
    **Requested gpu memory for 171496 particles: 20.9 MB.
    CellOrder="XYZ"
    CellMode="2H"
    Hdiv=1
    MapCells=(55,23,31)
    DomCells=(55,23,31)
    DomCellCode="11_10_11"

    BlockSize calculation mode: Empirical calculation.
    BsForcesBound=Dynamic (46 regs)
    BsForcesFluid=Dynamic (57 regs)

    **CellDiv: Requested gpu memory for 180070 particles: 1.4 MB.
    **CellDiv: Requested gpu memory for 4992 cells (CellMode=2H): 0.1 MB.
    RunMode="Pos-Double, Single-Gpu, HostName:tesla3"
    Allocated memory in CPU: 15434640 (14.72 MB)
    Allocated memory in GPU: 23489952 (22.40 MB)
    Part_0000 171496 particles successfully stored

    [Initialising simulation (solxji9b) 09-07-2017 16:36:40]
    PART PartTime TotalSteps Steps Time/Sec Finish time
    ========= ============ ============ ======= ========= ===================
    /var/spool/slurmd/job00010/slurm_script: line 27: 6242 Segmentation fault $GpuSphProcessor $dirout/$name $dirout -svres -gpu
    Execution aborted
  • Sergey, I also get a seg fault from the compiled DualSPHysics on 64-bit Linux. If you figure out a solution, please share it.
  • Ubuntu 16.04, CUDA 8.0
  • " If you figure out a solution, please share it." - I did not understand what you wrote
  • In other words, if you figure out a way to avoid the segmentation fault, please let me know. Good luck!
  • Hello, is there any news on this segmentation fault issue? I am also running Ubuntu 16.04 and have the same problem. I tried installing 14.04 instead, but my graphics card doesn't support CUDA 7.5.
  • Hi there,

    First post, and I hope you don't mind me reviving this old thread, but it seemed relevant. I am quite interested in DualSPHysics and I am trying to get it running on Debian Stretch. I installed CUDA 8 from the standard repositories and tested it with a simple CUDA example, so all seems to be working. The one caveat is that the Debian package seems to require nvcc to be run with -ccbin clang-3.8.
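
    For anyone else on Stretch, the test I mean is just compiling a trivial CUDA source with the packaged nvcc, forcing clang-3.8 as the host compiler (test.cu below stands for any minimal example; as I understand it, Stretch's default gcc is newer than what CUDA 8 supports, hence clang):

    # build a minimal CUDA example with the Debian-packaged toolkit,
    # using clang-3.8 as the host compiler
    nvcc -ccbin clang-3.8 -o test test.cu
    ./test

    Presumably the same -ccbin clang-3.8 flag would also need to be passed to the nvcc calls in the DualSPHysics Makefile.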

    The packaged binaries in v4.2 don't run and, if memory serves, the error led me to believe that they were built with a newer version of CUDA. Rebuilding DualSPHysics seems to be the best option, but there appears to be an issue between clang and DualSPHysics.
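
    For what it's worth, one way to check which CUDA runtime a prebuilt executable expects is to look at its dynamic dependencies (the binary name below is just a placeholder for whatever ships in the 4.2 package):

    # list the CUDA libraries the packaged executable was linked against
    ldd ./DualSPHysics4.2_linux64 | grep -i cuda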

    My question is: has anyone had success with v4.2, the stock CUDA 8.0, and vanilla Debian Stretch? BTW, my card (Quadro 4000) is a bit dated, and NVIDIA dropped support for it after CUDA 8.

    Thanks
    PreDead
  • I also want to contribute, as I think there are still people who have to face this problem.

    DUALSPHYSICS 4.2 - COMPILATION FOR THE CUDA FERMI ARCHITECTURE (CUDA 8.0, E.G. QUADRO 2000) ON WINDOWS 10

    - Download the development toolkit (SDK) for Windows 10 SP1 (6.1.7601) and install it.
    - Download the development toolkit for CUDA 8.0 and install it.
    (It will be placed in a folder like C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0)
    - Create a system environment variable CUDA_PATH_V8_0 containing the path to the CUDA 8.0 folder.
    - Install Visual Studio 2015.
    - Change the following parameters within the project file DualSPHysics4Re.vcxproj:
    - Windows 10 SDK version:
    10.0.17134.0 -> 6.1.7601
    - Visual Studio 2015 toolset:
    v141 -> v140
    - CUDA installation folder:
    $(CUDA_PATH_V9_2) -> $(CUDA_PATH_V8_0)
    If needed, also adjust the corresponding subfolders.
    - CUDA code generation for the Fermi architecture (supported up to CUDA 8):
    Add
    compute_20,sm_20;compute_21,sm_21;
    before
    compute_30,sm_30;compute_35,sm_35;compute_50,sm_50;compute_52,sm_52;compute_61,sm_61;compute_70,sm_70
    Change
    compute_30,sm_30 -> compute_21,sm_21
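
    As a sanity check, the code-generation entries above correspond to nvcc flags along these lines (illustrative only; the exact set depends on the GPUs you want to support, and if nvcc complains about an unsupported architecture, drop the compute_70,sm_70 entry, since Volta support only arrived with CUDA 9):

    # Fermi (sm_20/sm_21) plus the newer architectures that CUDA 8.0 knows about
    nvcc ... -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_21,code=sm_21 \
             -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code=sm_35 \
             -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 \
             -gencode=arch=compute_61,code=sm_61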
  • I forgot to highlight that the compilation is for DualSPHysics 4.2.
    Unfortunately, I couldn't get DualSPHysics4.0_LiquidGas to work :/
  • Agavino, thanks for the detailed reply. I was wondering about Debian but I do see some things in your post for me to try. When I get back to the machine I will make another attempt. I will report back!