Nvidia Tesla GPUs stop working at high particle numbers

edited November 2016 in DualSPHysics_v4.0
Hello,

I have already contacted the developers of DualSPHysics, but maybe one of the other users here has encountered the same problem and knows a solution.

I am trying to simulate a complex floating body, which works very well. Unfortunately, the comparison with experimental data shows a big difference in the pitch angles, so we need to run the simulation at a much finer resolution.
We use an Nvidia Tesla K80 and a K40, but the limit seems to be at ~7 million particles. If we go finer, the simulation stops without giving any reason. GenCase creates the initial setup, but the DualSPHysics code either doesn't start or stops directly after starting. This error occurs on both cards.

This is what the Run.out file looks like; the output file of the server looks the same:
--------------------------------

GenCase4 v4.0.027 (11-05-2016)
================================

...


Allocated memory in CPU: 1051300270 (1002.60 MB)
Allocated memory in GPU: 2066597824 (1970.86 MB)
Part_0000 11110779 particles successfully stored

[Initialising simulation (on3cg6z4) 11-11-2016 22:59:04]
PART PartTime TotalSteps Steps Time/Sec Finish time
========= ============ ============ ======= =========

--------------------------------
So as you can see, there is no error message or anything.

According to the developers, we did nothing wrong with the setup of the simulation, so maybe we have to change some settings of the GPUs to reach their computing limit?

Regards
Jannik

Comments

  • Right after posting this, our server admin contacted me and told me that there was a problem with the allocated memory in CPU, which we could solve easily.

    So no problem with DualSPHysics and no problem with the Tesla GPU.

    Regards
    Jannik
  • Next time you are working on a server and cannot read any log file with the error information, please try a desktop machine first; you may get more information there.

    Happy to read that the problem was finally solved, Jannik

    Regards
  • The last case I ran on a Tesla K40 had 65 000 000 particles
    Multiprocessors 15 (2880 cores)
    Memory global 11519 MB
    I have not checked what the actual limit for this device is
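
A rough way to relate the numbers in this thread: the Run.out excerpt gives the allocated GPU and CPU memory for 11110779 particles, so a per-particle footprint can be derived and scaled to the 11519 MB of global memory reported for the K40. The sketch below is only back-of-the-envelope arithmetic on those figures. It assumes memory use grows linearly with particle count and ignores any fixed overhead or domain-dependent allocations, which the thread does not describe; the function and variable names are made up for illustration.

```python
# Figures copied from the Run.out excerpt in the original post.
GPU_BYTES = 2_066_597_824   # "Allocated memory in GPU"
CPU_BYTES = 1_051_300_270   # "Allocated memory in CPU"
PARTICLES = 11_110_779      # particles stored in Part_0000

# Global memory of the Tesla K40 as quoted in the last comment.
# The log converts bytes to MB with 1 MB = 2**20 bytes
# (2066597824 bytes -> 1970.86 MB), so the same factor is used here.
K40_GLOBAL_MB = 11_519
BYTES_PER_MB = 1024 ** 2


def bytes_per_particle(total_bytes: int, particles: int) -> float:
    """Average memory footprint of one particle for a given allocation."""
    return total_bytes / particles


def max_particles(memory_bytes: int, per_particle: float) -> int:
    """Particle count that fits in memory_bytes.

    Assumption (not confirmed in the thread): memory scales linearly
    with the number of particles and there is no fixed overhead.
    """
    return int(memory_bytes // per_particle)


gpu_bpp = bytes_per_particle(GPU_BYTES, PARTICLES)
cpu_bpp = bytes_per_particle(CPU_BYTES, PARTICLES)
k40_limit = max_particles(K40_GLOBAL_MB * BYTES_PER_MB, gpu_bpp)

print(f"GPU bytes per particle: {gpu_bpp:.1f}")
print(f"CPU bytes per particle: {cpu_bpp:.1f}")
print(f"Estimated K40 capacity: {k40_limit:,} particles")
```

With the figures from this thread, the GPU footprint comes out at roughly 186 bytes per particle and the estimated K40 capacity at about 65 million particles, which is in line with the case size mentioned in the last comment. It also suggests that an 11 million particle case is well within the GPU memory of these cards, consistent with the cause having been on the CPU memory side.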