Anyone familiar with Julia?

edited April 2019 in DualSPHysics v4.4
Hey guys

I've been writing a function in Julia which lets a user read the binary vtk files into Julia and work with the data in there. Currently I've been testing on "03_MovingSquare" with default settings, exporting all variables out like this:

%partvtk% -dirin %diroutdata% -savevtk %dirout2%/PartSquare -onlytype:-all,moving -vars:+all
if not "%ERRORLEVEL%" == "0" goto fail

This gives me binary vtk files of about 202 KB each; with n = 251 files that is a total of 50.702 MB. Using my function in Julia I am able to import mass and acceleration as fast as:

@time k = readVtkArray("PartSquare",Mass);
  0.056765 seconds (10.17 k allocations: 5.545 MiB)
@time k = readVtkArray("PartSquare",Ace);
  0.055016 seconds (10.42 k allocations: 10.644 MiB)

For both cases this gives a read speed of about 50.702 MB / 0.056 s ≈ 905 MB/s, which in my opinion is pretty fast.
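For anyone curious, the core idea behind the reader is roughly the following. This is only an illustrative sketch, not the actual implementation: it assumes the files are legacy binary VTK as written by partvtk (ASCII header lines mixed with big-endian binary blocks), that the requested field is stored as Float32, and the helper names are made up.

# Sketch only: pull one named field ("Mass", "Ace", ...) out of a legacy binary VTK file.
vtksize(t) = t in ("float", "int", "unsigned_int") ? 4 :
             t in ("char", "unsigned_char")        ? 1 : error("unhandled VTK type $t")

function read_field(path::AbstractString, name::AbstractString)
    open(path, "r") do io
        n = 0                                        # number of particles
        while !eof(io)
            tok = split(readline(io))
            isempty(tok) && continue
            if tok[1] == "POINTS"
                n = parse(Int, tok[2])
                skip(io, 3 * n * 4)                  # coordinate block: 3 Float32 per particle
            elseif tok[1] == "VERTICES"
                skip(io, parse(Int, tok[3]) * 4)     # connectivity block: Int32 values
            elseif tok[1] == "SCALARS"
                readline(io)                         # consume "LOOKUP_TABLE default"
                tok[2] == name && return read_be32(io, n)
                skip(io, n * vtksize(tok[3]))        # some other field: skip its block
            elseif tok[1] == "VECTORS"
                tok[2] == name && return reshape(read_be32(io, 3n), 3, n)
                skip(io, 3 * n * vtksize(tok[3]))
            end
        end
        error("field $name not found in $path")
    end
end

# Legacy binary VTK is big-endian: read n 32-bit values and convert to host-order Float32.
function read_be32(io::IO, n::Integer)
    raw = Vector{UInt32}(undef, n)
    read!(io, raw)
    return reinterpret(Float32, ntoh.(raw))
end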

Using these arrays I am then able to make plots like this, which can be inserted into LaTeX directly by code (see the included pdf for a high-quality version):



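The plotting step itself is roughly like this. A minimal sketch, assuming readVtkArray returns one array per saved time step, Plots.jl as the plotting package, and a made-up save interval of 0.01 s:

using Plots

ace = readVtkArray("PartSquare", Ace)               # acceleration, one array per saved step
t   = range(0, step = 0.01, length = length(ace))   # assumed save interval of 0.01 s
plot(t, [maximum(abs, a) for a in ace],
     xlabel = "t [s]", ylabel = "max |a| [m/s^2]", legend = false)
savefig("acce_plot.pdf")                            # drop straight into LaTeX via \includegraphics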
So why am I telling you this? I am trying to develop a tool in a free, open-source language (Julia) which will make it easier to work with DualSPHysics: instead of having to rely on CSV files, I can export only the binary vtk files and use them directly. Does anyone here know Julia, and would anyone be interested in this?

Kind regards

Comments

  • Nicely done! I really like the idea of working with DualSPHysics outputs in the Julia environment.
    Public Service Announcement: If you're reading this and you know some Python, you might really like Julia for fast & elegant scientific computing and data analysis.
    Will this project be available on GitHub? Are there more post-processing tasks you'd like to achieve in Julia?

    I'm working on some pre-processing tools for DualSPHysics written in Python.
    It's tempting for me to switch the codebase to Julia, but the fact that Python is already in the DualSPHysics ecosystem (on the pre-processing [DesignSPHysics] and post-processing [VisualSPHysics] sides) makes me feel inclined to keep it in Python.
  • Hello!

    Exactly. I've chosen to use Julia since I have a background in Matlab, and using Julia has also helped me get better with Python, since Julia shares some similarities with Python, like the "for i in array" notation, which is not possible in Matlab. I chose Julia because I was advised that it is a high-speed language with syntax close to what I was used to. The fact that it is open source also makes group collaboration much easier (no licenses to deal with most of the time, etc.).

    I am still trying to understand the whole process of making a GitHub repository and making a package in Julia, but I would be very willing to give back to the community by making this package. Currently my brother and I are working on it, but it would be awesome to work with more people.

    Some other post-processing tools I would like to make are:
    1. Read the binaries (.bi4) directly (possible, I have done it, but slow compared to vtk)
    2. Make some automated tools for force plots; imagine you have 10 floating bodies in a simulation and want force plots of all of them every time, which becomes tiresome very quickly
    3. Make statistics tools for post-processing, i.e. how big the density fluctuation is from time step to time step, how the total mass of the simulation changes with time, etc. (a small sketch follows below)
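    As a small sketch of point 3, assuming readVtkArray returns a vector with one density array per saved time step and that density is exported under its DualSPHysics name Rhop:

    rho = readVtkArray("PartSquare", Rhop)    # density, one array per saved step

    rho0  = 1000.0f0                          # assumed reference density
    fluct = [(maximum(r) - minimum(r)) / rho0 for r in rho]   # relative fluctuation per step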
    Regarding pre-processing tools, I understand your choice to stay in Python. I just feel that some more performance might be gained from Julia compared to Python, and I also believe Juno (the Atom-based IDE) is very nice and makes working with the developed tools feel very Matlab-like. It also seems that Julia has XML reading tools, so it should be possible to work with XML in Julia as well.

    I will probably keep using Julia since it will ease my workflow, especially the multiple dispatch system:



    So I can give a function the same name but different argument types, i.e. Array{Float32,1} or Array{Float32,2}, instead of having to check the type myself; Julia will then very efficiently choose the right method to use.
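    For example, a minimal illustration (the function name is made up, not from the package):

    # One function name, one method per argument type; Julia picks the right method.
    field_summary(a::Array{Float32,1}) = sum(a)                 # scalar field, e.g. mass
    field_summary(a::Array{Float32,2}) = vec(sum(a, dims = 1))  # vector field, sum each column

    field_summary(rand(Float32, 10))      # dispatches to the 1-D method
    field_summary(rand(Float32, 10, 3))   # dispatches to the 2-D method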

    Sorry for the long post, I just got excited that someone using DualSPHysics also knows Julia :-)

    Kind regards
  • edited April 2019
    Great! There's a lot to unpack here, so I'll just stick to the GitHub repository issue for now since it's low-hanging fruit. There's some great documentation on GitHub for creating a new repository. It's a painless process and can be done very quickly.

    It really pays dividends to start with a good README explaining what the code is designed to do, preferably with examples of how it works. There are many guides for creating excellent READMEs, and while some of the examples are very thorough, it's fine to start simple.

    Creating a Julia package is a good idea, and it's an idea that could be added to a GitHub Project Board within your repository.

    I'm a bit over-committed to various code projects at the moment, but I'd be glad to contribute to this project on an intermittent basis.
  • Thanks for your comment! I will look into it and probably start writing a README on GitHub to be prepared. I have some questions you might be able to help me with, though:
    1. Regarding naming, I am considering something in the style of "DesignSPHysics", maybe something like "DataSPHysics", but I am having a hard time finding a suitable name. I think using "DualSPHysics" itself sets expectations a bit too aggressively; maybe I should consider a name without DualSPHysics in it, in case the developers do not want to be associated, or a more general one like "VtkPost".
    2. About licensing: if I want it to be open source with attribution (i.e. the code is made by me / the GitHub group), which licenses would make sense to choose? MIT, or something that matches DualSPHysics, which uses a GPL license?
    I will start taking some baby steps, trying to outline current possibilities and future goals. Thanks for your time!

    Kind regards
  • Hello,
    This is a very nice conversation!
    I'm interested in helping to develop both pre- and post-processing tools.
    Keep us posted, please!

    I've never heard of Julia before, but I'm eager to learn it, as well as how to manage a GitHub repository. Right now I'm mostly using Python for my post-processing.

    @NWRichmond what kind of pre-processing tools are you developing?

    Kind regards
  • Thanks for joining, TPouzol! What are you currently doing when post-processing in Python, and how is the performance?

    Also, I've made the GitHub repo and started to work on the readme; later this week I hope to upload the first version of the Julia package, but that might change.

    https://github.com/AhmedSalih3d/PostSPH.jl

  • Thanks,
    I've coded a csv importer.
    Currently, I run the same simulation with only a few tweaks (viscosity parameters, geometry of the channel, ...), and I generate the same csv files over and over (flow, velocity). My Python script then imports and analyses them for me.
    Relative to the time it takes to produce the data (simulation and csv export), the Python script is fast enough. However, I have a feeling it is not really efficient (it only uses one physical core of my CPU, for example). I know it's possible to do parallel computing in Python, but it's not worth the time in my case.
  • Awesome, I'm hoping you would be willing to test a bit with Julia when I manage to put a package together properly :-) If you choose to install Julia, I advise this approach:

    http://docs.junolab.org/latest/man/installation/

    Using Juno through Atom as an IDE for Julia has been the most pleasant experience for me.

    It will take me a bit of time to get a package done properly.

    Kind regards
  • Thanks for the advice.
    I'll test this as soon as possible (high workload for now...)

    Kind regards
  • I'm wondering why you are working with vtk files? They use extra space and need extra time.
    I use the export to csv files, but I have three main problems:
    - the export tool from DSPH only uses one physical core of the CPU (I could be wrong); parallel computing could increase speed there (I do not wish to have a GPU version of it, since I often export data while the simulation is still running)
    - data are loaded and scanned for each type of export you want (flow + velocity = 2 exports); maybe doing only one pass over the data and doing all the exports at the same time could speed things up?
    - csv files need to be re-imported for analysis; having other formats ready for other kinds of programs could be great, or a csv format more friendly for import (like a table rather than a matrix, even if it consumes more space)?

    Did you run your current version of PostSPH with a big simulation (lots of particles and lots of time steps)? It could be a problem regarding memory usage.

    I'm just sharing ideas and needs, not making demands. They could be stupid and useless to others (especially since I'm writing this late at night).

    Kind regards
  • @TPouzol good questions, here are my answers:

    Why not export all data from a file at a time?

    The reason for this is that often it is only necessary to use a few parameters, say velocity, mass and acceleration (or maybe something else), and having to read more than that would be inefficient. It is better to be able to pick and choose individually. Making a version of the function which reads everything could also be an option (see the sketch below).
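    Something along these lines could work as a middle ground; a hypothetical wrapper, not in the package, using the variable names from the earlier examples:

    # Read a chosen set of variables in one call and keep them in a Dict.
    readSelected(prefix, vars) = Dict(v => readVtkArray(prefix, v) for v in vars)

    data = readSelected("PartSquare", (Mass, Ace))
    data[Mass]   # the mass arrays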

    Why work with .vtk files directly?

    The reason to use .vtk files is that they can store the same information as a .csv file but in binary format, which means a .vtk file in some cases only uses about 20 % of the space a .csv file would use. The reason for not reading from the .bi4 files directly is that only four properties are stored in them, so extra math would have to be done anyway. Also, reading from .bi4 files has so far not been as fast as reading from .vtk.

    Reimporting csv files?

    Using this program it is not necessary to reload .csv files, since there is no need to generate them at all to read the data: exactly the same data is in the binary .vtk files.

    Memory Consumption?

    You are right, reading a whole simulation of 500 steps with a lot of data can take up a lot of RAM. A solution for this is to do the math immediately and then discard the data, as sketched below.
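    A rough sketch of that idea (readVtkFile here is a hypothetical single-file variant of readVtkArray, which currently reads every step of a prefix at once):

    # Loop over the saved steps one file at a time and keep only the reduced quantity.
    files = sort(filter(f -> startswith(f, "PartSquare") && endswith(f, ".vtk"), readdir()))
    totalmass = Float64[]
    for f in files
        m = readVtkFile(f, Mass)      # hypothetical per-file reader
        push!(totalmass, sum(m))      # the full array can be discarded right away
    end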

    I have made an initial offline package and will try to make it available online as soon as possible. Thanks for your interest. One of the most important things will be to benchmark the tool.

    Currently the readme is available here:

    https://github.com/AhmedSalih3d/PostSPH.jl

    Kind regards
  • @NWRichmond, I have some great news: the package, "PostSPH", with a readme.md, is now released under the MIT license. It can be installed using the following command:

    ] add https://github.com/AhmedSalih3d/PostSPH.jl

    The bracket, "]", is used to enter pkg mode. Then you can run the function on, for example, a "PartFloating" output, using this read command:



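    Roughly like this, following the earlier example (the exact enum values may differ in the released version):

    using PostSPH

    ace = readVtkArray("PartFloating", Ace)   # reads Ace from the PartFloating vtk files in the current folder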
    Use "Cat(0)" for "Points" / position instead. Currently, if you try to read a property which does not exist in a file, Julia will crash, but I will weed that out later. For now I just hope you would like to try it. The complete readme is found at:

    https://github.com/AhmedSalih3d/PostSPH.jl


    Kind regards
  • Nicely done! I look forward to using it. Thank you for creating a GitHub repo and Julia package - that will make it so much easier for others to test and contribute to.
  • This looks very interesting.
    Can I suggest that you present this new tool at the 5th DualSPHysics Users Workshop, Universitat Politècnica de Catalunya, Barcelona, Spain, 23-25 March 2020?

    Regards
  • @NWRichmond looking forward to your thoughts. I think the next step is making sure the low-hanging fruit is completely finished and then showcasing some more examples, since I really like the direction it is going: being able to use a few commands to gather the available vtk files in a folder and then run the parts which are of interest, i.e. force calculation, automatically. I hope you play around with it a bit; if you have any questions, feel free to ask, and I can also provide some examples.

    @Alex, that sounds like an interesting prospect! A year should also be enough to make something solid, so let us talk about it again when it gets closer.

    Kind regards