Porting Boris Particle Pusher to DPC++. Performance Analysis and Optimization on Intel CPUs and GPUs

This talk reports the results of porting one of the key computational kernels of the open-source numerical code High-Intensity Collisions and Interactions (Hi-Chi) to the DPC++ programming language. Before porting the entire Hi-Chi code to DPC++, we started with the Boris particle pusher to assess the porting effort and the performance to expect on state-of-the-art Intel CPUs and GPUs. We found that the optimized parallel C++ code can be ported to DPC++ relatively easily, and that performance on the Xeon Platinum 8260L differs only slightly (~10% on average) from the baseline C++ code, which is a reasonable price to pay for portability. The code, originally optimized for CPUs, can now also run on Intel GPUs (P630 and Iris Xe Max), where it delivers the expected performance with room for further improvement. In the talk, we will share our porting experience, the optimization techniques we used, and the current results of optimizing the code for Intel GPUs.
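
For readers unfamiliar with the kernel in question, the following is a minimal sketch of one relativistic Boris push step in plain C++. It is not the Hi-Chi implementation: the names `Vec3`, `Particle`, and `borisPush`, the SI-style units, and the choice to store the reduced momentum u = γv are all assumptions made for illustration. A per-particle function of this kind is what a parallel loop over particles would typically call, whether in C++ or in a DPC++ kernel.

```cpp
#include <cmath>

// Hypothetical 3-vector type used only in this sketch.
struct Vec3 { double x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Hypothetical particle layout: position r, reduced momentum u = gamma * v,
// charge q, and mass m (SI-style units assumed).
struct Particle { Vec3 r; Vec3 u; double q; double m; };

// One Boris step: half electric kick, magnetic rotation, half electric kick,
// then a position update with the new velocity.
inline void borisPush(Particle& p, Vec3 E, Vec3 B, double dt, double c) {
    const double h = 0.5 * p.q * dt / p.m;

    // First half of the electric acceleration.
    const Vec3 uMinus = p.u + h * E;

    // Lorentz factor at the midpoint of the step.
    const double gamma = std::sqrt(1.0 + dot(uMinus, uMinus) / (c * c));

    // Rotation around B; this part preserves |u| exactly in exact arithmetic.
    const Vec3 t = (h / gamma) * B;
    const Vec3 uPrime = uMinus + cross(uMinus, t);
    const Vec3 s = (2.0 / (1.0 + dot(t, t))) * t;
    const Vec3 uPlus = uMinus + cross(uPrime, s);

    // Second half of the electric acceleration.
    p.u = uPlus + h * E;

    // Advance the position using v = u / gamma.
    const double gammaNew = std::sqrt(1.0 + dot(p.u, p.u) / (c * c));
    p.r = p.r + (dt / gammaNew) * p.u;
}
```

Because the function touches only one particle and the local field values, each call is independent of the others, which is the property that makes a kernel of this shape a natural first candidate for porting.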
