GPU performance evaluation

Following the discussion here: forums.dual.sphysics.org/discussion/1846

A few of us DSPH users (including @Asalih3d) wanted to evaluate the performance of different hardware setups. This would help people decide whether they should upgrade their hardware.

As a first step, we chose to run the dambreak example because it is simple.

Here is the XML code I ran

It is set with dp=0.01. I ran tests with dp from 0.01 down to 0.005 in steps of 0.001.
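To make the sweep concrete, here is a minimal Python sketch that lists the dp values and estimates how the particle count grows relative to dp=0.01. It assumes a 3D case with fixed geometry, so that the particle count scales roughly as (1/dp)^3; the function names are made up for illustration and are not part of DualSPHysics.

```python
# Sketch only: lists the dp values of the sweep and a rough relative particle count.
# Assumption: 3D case with fixed geometry, so particle count ~ (1/dp)**3.

def dp_sweep(start_mm=10, stop_mm=5, step_mm=1):
    """Return dp values in metres, from start down to stop (inclusive)."""
    # Integer millimetres avoid floating-point drift while stepping.
    return [mm / 1000 for mm in range(start_mm, stop_mm - 1, -step_mm)]

def relative_particle_count(dp, dp_ref=0.01, dims=3):
    """Approximate particle count relative to the reference dp."""
    return (dp_ref / dp) ** dims

for dp in dp_sweep():
    factor = relative_particle_count(dp)
    print(f"dp={dp:.3f}  approx. {factor:.2f}x the particles of dp=0.010")
```

Under that assumption, halving dp multiplies the particle count by about eight, which is why the finest case dominates the total run time of the sweep.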

Here are the results (3x GTX 1080 Ti + 2x GTX 1060)

Clock speeds are not the spec values but the ones I read under load with the GPU-Z tool (don't forget to "pre-heat" your GPU, as the clock speed shown in the sensors varies at the beginning). I did not detect any thermal throttling.

I compared the theoretical performance delta between two hardware setups with the actual performance delta (averaged over all simulations).

The performance of the three 1080 Ti cards is consistent from one to another (different manufacturers and different CPUs).

Comparing the 1080 Ti and the 1060, the theoretical and actual deltas differ: the 1060 performs 12% better than expected.
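For anyone who wants to redo this comparison with their own numbers, here is a minimal Python sketch. It assumes the theoretical figure is the product of CUDA cores and clock speed under load (the "perf" value explained further down in the comments) and that actual speed is the inverse of run time. The clock speeds and run times below are placeholders, not the results from my spreadsheet, and the function names are only illustrative.

```python
# Sketch only: compare theoretical and measured performance ratios of two GPUs.
# Clock speeds and run times are PLACEHOLDERS, not the results from this thread.

def perf_index(cuda_cores, clock_mhz):
    """Theoretical figure: CUDA cores times clock speed read under load."""
    return cuda_cores * clock_mhz

def theoretical_ratio(gpu_a, gpu_b):
    """Expected speed of GPU A relative to GPU B."""
    return perf_index(*gpu_a) / perf_index(*gpu_b)

def actual_ratio(runtimes_a, runtimes_b):
    """Measured speed of A relative to B, averaged over all runs (speed = 1/runtime)."""
    ratios = [tb / ta for ta, tb in zip(runtimes_a, runtimes_b)]
    return sum(ratios) / len(ratios)

def deviation_percent(theory, actual):
    """Positive: GPU B does better than the theoretical ratio predicts."""
    return (theory / actual - 1) * 100

# Published core counts (GTX 1080 Ti, GTX 1060 6 GB); clocks are placeholders.
gpu_a = (3584, 1900)
gpu_b = (1280, 1900)
# Placeholder run times in seconds, one entry per dp value.
times_a = [100, 200]
times_b = [260, 520]

theory = theoretical_ratio(gpu_a, gpu_b)
actual = actual_ratio(times_a, times_b)
print(f"theoretical {theory:.2f}x, actual {actual:.2f}x, "
      f"GPU B deviation {deviation_percent(theory, actual):+.1f}%")
```

Replace the tuples and run-time lists with your own measurements, using the same order of dp values for both GPUs.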

Feel free to crunch my numbers differently and discuss.

It would be very beneficial if other people could run the same test, especially with RTX series (2000 and 3000...).

Kind regards.

Comments

  • Awesome!

    I hope to get the results for my weaker GPU over the weekend for comparison. If anyone has access to 2000 or 3000 series cards and wants to share their results, that would also be quite nice. For now, don't worry too much about the exact version; just make sure it is at least version 5.

    For now I think we should keep the results here in this thread, and later on perhaps migrate them to a GitHub repository. Unfortunately I am quite busy at the moment.

    Kind regards

  • What is the perf row in the spreadsheet?

  • Hi @jonnilehtiranta

    Perf stands for performance: it is the product of the number of CUDA cores and the clock speed.

  • Are you running with double-precision or single-precision? Some people are not aware that the number of cores quoted for each card is actually the number of single-precision cores. The number of double-precision cores, and hence performance, is dramatically less for consumer-grade (GTX, RTX) cards. This article explains why: https://arrayfire.com/explaining-fp64-performance-on-gpus/

    That's why I bought a Titan Black, which still has competitive double-precision price / performance when set to double-precision mode.

    This article also points out that the double-precision performance of consumer-grade AMD graphics cards is dramatically better than that of Nvidia cards. Any chance of DualSPHysics being ported to AMD graphics cards? This would be a popular move, as getting high double-precision performance from Nvidia costs an arm and a leg. Not only are their professional cards much more expensive, they also have no graphical output, which forces you to have a motherboard that can handle two graphics devices simultaneously AND with two separate PCIe channels to maintain the data bandwidth to the card doing the heavy processing. Unless you buy a Titan Black.

  • edited February 8

    @MikeHersee As far as I understand, since DualSPHysics 5.0 positions are always in double precision, and velocity and density are always in single precision (https://forums.dual.sphysics.org/discussion/1812/). I wondered what the penalty in compute time is after this choice, but in the end I did not test it. You would need to run the older DSPH 4.4, and since DSPH is refactored in places from time to time, the comparison would not be very sound.

    Long story short, my intuition is that double-precision cores are now always involved: this is a bottleneck to consider when buying a GPU.

    To reinforce your message: the conceptual map of a Pascal architecture below shows the single-precision cores in light green and the double-precision cores in yellow. They are distinct processors, and there are half as many double-precision cores. No surprise, then, that the compute throughput in double precision is half of that in single precision.


  • @MikeHersee @sph_tudelft_nl

    Thanks to both of you for your input! I started this post for the exact reasons you mention: it is not easy to predict GPU performance for DSPH (mix of single and double precision).

    The Titan Black seems like a nice deal, but it is very hard to find nowadays.

    Would you be willing to run the standard DamBreak example (it is fast) on v5.0 and share your GPU model and results, so we can shed some light on this subject and help people make a sound investment when they start using DSPH?

    Thanks

  • @jmdalonso can post here some performance figures for the different GPUs he has been testing lately.
