HipSYCL’s Quest for Universal SYCL Binaries: One Binary from NVIDIA and AMD to Ponte Vecchio

Software vendors who distribute SYCL binaries to end users with unknown hardware configurations typically want a single binary that can run on any hardware.


In current SYCL implementations such as DPC++ or hipSYCL, constructing such “universal” binaries can be an arduous task: Their multipass compilation model requires parsing the code separately for each targeted backend and, in the case of AMD hardware, once per target device. This can inflate compile times to impractical levels.


In this talk, we discuss hipSYCL’s new generic single-pass compiler, which can generate a universal binary while parsing the source code only once. With runtime performance comparable to that of current compilers, the new compiler can substantially outperform other SYCL compilers in terms of compile time, and it provides instant binary portability across NVIDIA, AMD and Intel GPUs, including Ponte Vecchio. We also present the very first hipSYCL performance numbers on Ponte Vecchio.