Processing OpenFOAM in Parallel
This post is about a wonder I've come across: parallel visualization and processing of large CFD data sets using ParaView. My postdoctoral work on large-eddy simulations of full-scale river confluences challenged me to figure out how to manage, manipulate and visualize the vast quantity of data that meshes of 30+ million cells can produce. The parallel capabilities of ParaView have let me speed things up considerably, and if you're not currently leveraging them, I would definitely recommend giving them a try.
Background
I began my postdoc with the same serial post-processing workflow I had used in my previous projects, which worked well for relatively small domains (< 5 million cells). The typical workflow: send the case to the server, decompose the mesh across x cores, launch the simulation, wait, reconstruct a few fields from the last time-step, download the reconstructed fields to my local machine and view them as a reconstructed case in ParaView. Though reasonably effective for smaller datasets, this system rapidly breaks down when the goal is to visualize transient features of mixing processes at confluences modeled on a large domain (> 30 million unstructured cells). Volume rendering, any other computationally intensive operation, or even just loading the data on one core was taking forever.
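For reference, here is a rough sketch of that serial workflow as shell commands. It only prints the commands instead of running them. The solver (pimpleFoam), the fields (U and p), the login user@server and the case path cases/confluence are placeholders I made up for illustration, so swap in your own.

```shell
#!/usr/bin/env bash
# Print (rather than run) the commands of the serial workflow described above.
# Assumptions: pimpleFoam as the solver, U and p as the fields of interest,
# and the hypothetical names user@server and cases/confluence.
# The number of subdomains itself is set in system/decomposeParDict;
# the value passed to mpirun must match it.
serial_workflow() {
    local ncores="$1" solver="${2:-pimpleFoam}"
    cat <<EOF
rsync -av ./confluence/ user@server:cases/confluence/
decomposePar
mpirun -np ${ncores} ${solver} -parallel
reconstructPar -latestTime -fields '(U p)'
rsync -av --exclude 'processor*' user@server:cases/confluence/ ./confluence/
EOF
}
```

Calling serial_workflow 64 lists the five steps for a 64-core run. The last rsync skips the processor* directories so that only the reconstructed data comes back, which is still the slow part on a large mesh.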
I needed a path out of the madness. After some research, the true power of ParaView was finally revealed to me: remote parallel processing.
The parallel workflow: you run a local instance of ParaView (at home, work, university) that connects to a remote, parallelized instance of ParaView on the server, which reads and processes the decomposed data in parallel. Your GUI actions are sent from the local instance to the server, which then uses its cores to chew through the processing. Depending on the number of cores, processing can take a fraction of the time required on a single core.
Instead of transferring data between server and client, the remote instance of ParaView (the server) sends back only the rendered viewport for display in your local ParaView GUI (the client). Large data transfers are eliminated, as is the need to reconstruct fields. This turned a workflow that used to take me on the order of 1-2 days (mostly background downloading and processing) into an operation that typically takes less than an hour. Furthermore, the parallelization makes things much more interactive (you can quickly switch between time-steps of large unstructured datasets), which greatly improves the experience.
Steps:
If you have access to a cluster, it is very likely you can already set up parallel processing quite easily. Have a look at the Compute Canada website for instructions on how to set up the remote instance of ParaView. More than likely, your cluster uses similar, if not identical, commands to those listed there. The first time I tried these instructions on the Béluga cluster located at the École Polytechnique de Montréal, they worked perfectly. If you run into trouble, send an email to a system administrator and odds are they will be able to help you set it up.
On the server, download the binary files for the most recent headless version of ParaView, the one with osmesa and MPI in its name, from the ParaView download page. At the time of writing, the file I downloaded was ParaView-5.8.1-osmesa-MPI-Linux_Python3.7-64bit.tar.gz.
Un-tar the downloaded file, say in your home directory, and rename the folder to something identifiable such as 'paraViewHeadless'. Once renamed, run the following command from your home directory:
./paraViewHeadless/bin/mpiexec -np XX ./paraViewHeadless/bin/pvserver
Here XX is the number of cores the OpenFOAM case has been decomposed into (it must be less than or equal to the number of cores available on the server!). This starts pvserver (the ParaView server), which then waits for you to establish a connection (done in a later step). After a few moments, some information will print out on the terminal. Take note of the hostname in front of :11111 (in my case jason-Precision-T7610); you will need it for the ssh command in the next step.
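If you do this often, the unpacking and the launch command can be wrapped in a few small shell functions. This is only a sketch under a few assumptions of mine: the tarball contains a single top-level directory, the case uses the default layout with one processorN directory per subdomain, and the function names are made up for this post.

```shell
#!/usr/bin/env bash
# Unpack a ParaView tarball straight into a folder with a short, fixed name.
# --strip-components=1 drops the long top-level directory of the archive
# (assumes the archive has exactly one top-level directory).
unpack_paraview() {
    local tarball="$1" dest="${2:-$HOME/paraViewHeadless}"
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Count the processorN directories written by decomposePar.
count_subdomains() {
    local case_dir="$1"
    find "$case_dir" -maxdepth 1 -type d -name 'processor[0-9]*' | wc -l | tr -d ' '
}

# Print the pvserver launch command for a given case, using one MPI
# process per subdomain. Fails if the case has not been decomposed.
pvserver_command() {
    local case_dir="$1" pv_dir="${2:-$HOME/paraViewHeadless}"
    local n
    n="$(count_subdomains "$case_dir")"
    if [ "$n" -eq 0 ]; then
        echo "no processor* directories found in $case_dir" >&2
        return 1
    fi
    echo "$pv_dir/bin/mpiexec -np $n $pv_dir/bin/pvserver"
}
```

For a case decomposed into 64 subdomains, pvserver_command path/to/case prints the mpiexec line with -np 64, ready to paste or to pass to eval. It does not check how many cores the server actually has, so that part is still up to you.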
In a terminal on your local computer, use ssh to forward port 11111 to the server by typing something similar to this on the command line:
ssh jason@192.168.0.170 -L 11111:jason-Precision-T7610:11111
Enter your password and then just leave the terminal open in the background.
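To make the pieces of that ssh command explicit: -L local_port:host:host_port forwards a port on your local machine to host:host_port as seen from the server you log in to. The small helper below just assembles the same command from its parts. The function name is my own invention, and the default port is simply the 11111 used throughout this post.

```shell
#!/usr/bin/env bash
# Print the ssh port-forwarding command for a pvserver session.
#   $1  login, i.e. user@address of the server
#   $2  hostname that pvserver printed in front of :11111
#   $3  port (defaults to 11111, the port used in this post)
tunnel_command() {
    local login="$1" pv_host="$2" port="${3:-11111}"
    echo "ssh ${login} -L ${port}:${pv_host}:${port}"
}
```

With my values, tunnel_command jason@192.168.0.170 jason-Precision-T7610 reproduces the command above.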
On your local computer, open the same version of ParaView (but with a GUI, i.e. the standard download). Click on the 'Connect' icon and click 'Add', type in a name (e.g. foamServer), click 'Configure'. In the next window click 'Save'. Then select the new connection and click 'Connect'.
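Since the client and the server need to be the same version, a quick sanity check before connecting can save some head-scratching. Here is a minimal sketch that pulls the first x.y.z number out of two strings and compares them. The helper names are hypothetical, and the exact text printed by your ParaView binaries is an assumption on my part, which is why the functions only look for the version number.

```shell
#!/usr/bin/env bash
# Extract the first x.y.z version number found in a string.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed only if both strings carry the same x.y.z version number.
same_version() {
    local a b
    a="$(extract_version "$1")"
    b="$(extract_version "$2")"
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

One way to use it is to compare the name of the tarball you put on the server with whatever your local client reports, for example same_version "$(paraview --version)" "ParaView-5.8.1-osmesa-MPI-Linux_Python3.7-64bit.tar.gz", assuming your local paraview binary is on the PATH and accepts --version.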
After a few moments, you will see a localhost connection appear in the pipeline browser. You now have remote access to your server's file system. Navigate to the OpenFOAM case you want to explore and make sure there is an empty name_not_important.foam file in the root directory of the case. Open the *.foam file and then select 'Decomposed Case' in the Case Type field. You should now be viewing your data in parallel, using your server's resources to do the rendering!
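If the case does not have a .foam file yet, creating one on the server is a one-liner with touch. The sketch below wraps it in a function (the name is my own) that names the file after the case directory, which is just a convention I find handy since the name itself does not matter.

```shell
#!/usr/bin/env bash
# Create an empty <caseName>.foam marker file in the root of an OpenFOAM
# case and print its path. Defaults to the current directory.
make_foam_stub() {
    local case_dir="${1:-.}"
    local abs name
    abs="$(cd "$case_dir" && pwd)" || return 1
    name="$(basename "$abs")"
    touch "$abs/$name.foam" && echo "$abs/$name.foam"
}
```

Run make_foam_stub from inside the case directory, or pass the case path as the first argument. touch leaves an existing file untouched apart from its timestamp, so running it twice does no harm.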