Codeplay’s Contribution to oneAPI by supporting SYCL, oneDNN and oneMKL for NVIDIA GPUs

Andrew Richards, founder and CEO of Codeplay and oneAPI industry initiative member, pioneers GPU acceleration technologies.

Enabling software developers to write code once, and then tune it for multiple accelerator platforms, is the holy grail for the high-performance computing (HPC) and supercomputing industry.  Such a breakthrough would eliminate writing separate code for multiple accelerator platforms, a time-consuming and costly exercise that takes talented developers away from working on critical projects.

To solve this challenge, the oneAPI industry initiative launched in 2018.  With support from leaders in HPC and artificial intelligence (AI), as well as from original equipment manufacturers (OEMs), independent software vendors (ISVs), hardware vendors, cloud service providers (CSPs), universities, and other organizations, the initiative brought together broad and diverse expertise.  This made it possible to produce a cross-platform industry standard to help developers maximize computing power for exascale-class workloads on heterogeneous hardware.

As members of the oneAPI industry initiative, and of the SYCL community since its inception, CEO Andrew Richards and his team at Codeplay collaborated with colleagues across the industry to define the SYCL standard.  In 2020, as part of its work within the initiative, Codeplay announced its contribution to oneAPI: support for SYCL, oneDNN and oneMKL on NVIDIA GPUs.

“When Intel’s implementation of SYCL – known as DPC++ with extensions for CPUs, GPUs and FPGAs – became available, it offered us an opportunity to fully support NVIDIA GPUs and to integrate them into the LLVM compiler,” said Richards.

This meant developers could more easily code for NVIDIA GPUs without going through OpenCL.  The codebase for this implementation lives in the main DPC++ LLVM Compiler project.

Codeplay offers a oneAPI implementation framework that targets NVIDIA GPUs without intermediate layers, exposing the full performance of the underlying hardware.

The Codeplay implementation uses the native CUDA interface, enabling developers to gain portability and performance.  The interface is available for both DPC++ and the MKL-BLAS linear algebra math library, which is part of the oneMKL math framework, as well as for oneDNN. 
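To make the "write once" idea concrete, here is a minimal sketch of a vector addition written against the SYCL 2020 API.  It is an illustration under stated assumptions, not Codeplay's code: the helper name vector_add is hypothetical, and the sketch assumes a SYCL compiler such as DPC++ supplies the header sycl/sycl.hpp.  When that header is absent, the same function falls back to a plain host loop so the example still builds with an ordinary C++ compiler.

```cpp
// Hedged sketch: one vector-add source that can target any SYCL device.
// Assumption: a SYCL 2020 compiler (for example DPC++) provides <sycl/sycl.hpp>.
#include <cassert>
#include <cstddef>
#include <vector>

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#define SKETCH_HAS_SYCL 1
#else
#define SKETCH_HAS_SYCL 0
#endif

// vector_add is a hypothetical helper name, not part of any oneAPI library.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    if (a.empty()) return out;  // nothing to compute; avoid zero-sized buffers
#if SKETCH_HAS_SYCL
    // The default selector picks whichever device the runtime exposes;
    // with a CUDA-enabled DPC++ build that can be an NVIDIA GPU.
    sycl::queue q{sycl::default_selector_v};
    {
        sycl::buffer<float, 1> buf_a(a.data(), sycl::range<1>(a.size()));
        sycl::buffer<float, 1> buf_b(b.data(), sycl::range<1>(b.size()));
        sycl::buffer<float, 1> buf_out(out.data(), sycl::range<1>(out.size()));
        q.submit([&](sycl::handler& h) {
            sycl::accessor in_a(buf_a, h, sycl::read_only);
            sycl::accessor in_b(buf_b, h, sycl::read_only);
            sycl::accessor res(buf_out, h, sycl::write_only, sycl::no_init);
            // One work-item per element.
            h.parallel_for(sycl::range<1>(out.size()), [=](sycl::id<1> i) {
                res[i] = in_a[i] + in_b[i];
            });
        });
    }  // buffers are destroyed here, which copies the result back into out
#else
    // Host fallback so the sketch compiles without a SYCL toolchain.
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = a[i] + b[i];
#endif
    return out;
}
```

Calling vector_add({1, 2, 3}, {10, 20, 30}) yields the element-wise sums on whichever path is compiled in.  With a DPC++ build that includes the CUDA backend, the source would typically be compiled with flags along the lines of -fsycl -fsycl-targets=nvptx64-nvidia-cuda; the exact invocation depends on the toolchain version, so treat it as an assumption to check against the compiler documentation.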

“Accelerating devices in this way, by using standards that can run on numerous platforms and devices, frees developers to make accelerator choices based on what works best for the overall solution,” said Richards.

The benefits of using a DPC++ compiler, which offers a standards-based, open-source model, go beyond lower development costs for heterogeneous programming.  The compiler also enables faster application performance, more productivity, and greater innovation.

Cross Platform, from Concept to Industry Deployment

Codeplay’s implementation of the oneAPI spec led to two recent contracts that enable scientific organizations to take advantage of SYCL standards and compiler capabilities for NVIDIA and other GPUs.

In February 2021, the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (Berkeley Lab), in collaboration with the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory, announced that it had signed a contract with Codeplay to enhance the LLVM SYCL GPU compiler capabilities for NVIDIA A100 GPUs, which power NERSC’s next-generation supercomputer, Perlmutter.

The collaboration will help NERSC and ALCF users, as well as the HPC community at large, produce high-performance applications that are portable across multiple compute architectures from a range of vendors.

NERSC supercomputers are used for scientific research in medicine, alternative energy, environment, high-energy and nuclear physics, advanced computing, materials science, and chemistry.  During the past year, 20 research teams ran COVID-19 simulations to perform analysis and develop solutions.

ALCF provides supercomputing resources and expertise that enable researchers to take advantage of simulation, data science, and machine learning methods.  Over the past year, ALCF supercomputers also helped accelerate the development of treatments and strategies designed to overcome the COVID-19 pandemic.

A similar agreement was announced in June 2021 between Codeplay and Argonne National Laboratory (ANL), working in collaboration with Oak Ridge National Laboratory (ORNL).  Under this contract, Codeplay will extend the oneAPI DPC++ compiler to support AMD GPU-based HPC supercomputers.

Argonne’s exascale supercomputer, Aurora, is based on Intel GPUs, with SYCL being one of the primary programming models, while the Oak Ridge exascale supercomputer, Frontier, features AMD GPUs.

Exascale supercomputers process 10^18, or 1 quintillion, calculations per second, which is more than 150 petaflops.  This makes them the highest-performance computers in the world.  Both the Argonne and Oak Ridge labs are registered as U.S. Department of Energy (DoE) Office of Science User Facilities.
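As a quick check on those units, the conversion can be worked out with standard SI prefixes.  This assumes "calculations" means floating-point operations; the numbers below are unit arithmetic, not measured figures for Aurora or Frontier.

```latex
\[
1~\text{exaFLOPS} = 10^{18}~\text{FLOPS}
                  = 10^{3} \times 10^{15}~\text{FLOPS}
                  = 1000~\text{petaFLOPS},
\qquad
150~\text{petaFLOPS} = 1.5 \times 10^{17}~\text{FLOPS}
                     = 0.15~\text{exaFLOPS}.
\]
```

In other words, a machine at the exascale threshold delivers roughly 6.7 times the throughput of a 150-petaflop system.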

Supporting the SYCL open standard on Frontier will help expedite the development of scientific applications for heterogeneous compute environments across DoE national labs.  By providing code portability across multiple supercomputing systems, this will enable application scientists to leverage existing software assets and extend their development capabilities in HPC and AI.

At both the Argonne and Oak Ridge facilities, exascale supercomputers are also used to conduct scientific research in medicine, alternative energy, environment, high-energy and nuclear physics, advanced computing, materials science, and chemistry.  This work will help advance scientific computing through simulation, data science, and machine learning methods.

Fireside Chats with Andrew Richards

The oneAPI industry initiative encourages cooperation among individuals at numerous organizations and companies.  The chief topics of collaboration include the oneAPI specification and compatible oneAPI implementations across the ecosystem.  As a proponent of the initiative, Andrew Richards frequently speaks with people at different companies that are innovating in this area.

Video:  Andrew Richards of Codeplay talks with James Reinders of Intel on the DPC++ Compiler for the LLVM implementation of SYCL.

Watch Richards’ Fireside Chats – one with Intel’s James Reinders and another with Google’s Penporn Koanantakool – discussing how the oneAPI industry initiative helps enhance work with multiple processors and devices to increase performance.

Video:  Andrew Richards of Codeplay talks with Penporn Koanantakool of Google on integrating oneAPI into TensorFlow.

Get Involved and Review the oneAPI Specification

Learn about the latest oneAPI updates, industry initiative and news.  Check out our videos and podcasts.  Visit our GitHub repo – review the spec and give feedback or join the conversation happening now on our Discord channel.  Then get inspired, network with peers and participate in oneAPI events.


