oneAPI DevSummit at ISC 2021

June 22, 2021

Join us for the oneAPI Developer Summit at ISC, focused on oneAPI and Data Parallel C++ for accelerated computing across XPU architectures (CPU, GPU, FPGA, and other accelerators). In this two-day live virtual conference, you will learn from leading industry and academia speakers working on innovative cross-platform, multi-vendor oneAPI solutions.

– Collaborate with fellow developers and connect with other innovators.
– Dive into hands-on sessions where you will learn and apply optimizations to fully exploit device capabilities on CPUs and GPUs.
– Join a vibrant community supporting each other using oneAPI and Data Parallel C++.

At ISC21, join Intel to learn how our unique, XPU-centric portfolio and oneAPI are helping users redefine the limits of what's possible. In addition to the oneAPI DevSummit, don't forget to check out our new Intel® HPC + AI Pavilion: with over 20 tech talks, fireside chats, and demos, plus two executive keynotes, you'll get the latest news on 3rd Gen Intel® Xeon® Scalable processors, Distributed Asynchronous Object Storage (DAOS), Intel® Optane™ technology, Xe-HPC (Ponte Vecchio), and much more.

Agenda

Schedule: June 22

10:00 - 10:10 CET

Introduction

10:10 - 10:50 CET KEYNOTE

Experiences with adding SYCL support to GROMACS

GROMACS is an open-source, high-performance molecular dynamics (MD) package primarily used for biomolecular simulations, accounting for ~5% of HPC utilization worldwide. Due to the extreme computing needs of MD, significant efforts are invested in improving the performance and scalability of simulations. Target hardware ranges from supercomputers to laptops of individual researchers and volunteers of distributed computing projects such as Folding@Home. The code has been designed both for portability and performance by explicitly adapting algorithms to SIMD and data-parallel processors. A SIMD intrinsic abstraction layer provides high CPU performance. Explicit GPU acceleration has long used CUDA to target NVIDIA devices and OpenCL for AMD/Intel devices.

In this talk, we discuss the experiences and challenges of adding support for the SYCL platform into the established GROMACS codebase and share experiences and considerations in porting and optimization. While OpenCL offers the benefits of using the same code to target different hardware, it suffers from several drawbacks that add significant development friction. Its separate-source model leads to code duplication and makes changes complicated. The need to use C99 for kernels, while the rest of the codebase uses C++17, exacerbates these issues. Another problem is that OpenCL, while supported by most GPU vendors, is never the main framework and thus is not getting the primary support or tuning efforts. SYCL alleviates many of these issues, employing a single-source model based on the modern C++ standard. In addition to being the primary platform for Intel GPUs, the possibility to target AMD and NVIDIA GPUs through other implementations (e.g., hipSYCL) might make it possible to reduce the number of separate GPU ports that have to be maintained.

Some design differences from OpenCL, such as flow directed acyclic graphs (DAGs) instead of in-order queues, made it necessary to reconsider GROMACS's task-scheduling approach and the architectural choices in the GPU backend. Additionally, supporting multiple GPU platforms presents the challenge of balancing performance (low-level, hardware-specific code) against maintainability (more generalization and code reuse). We will discuss the limitations of the existing codebase and interoperability layers with regard to adding the new platform; compute performance and latency comparisons; code quality considerations; and the issues we encountered with the SYCL implementations tested. Finally, we will discuss our goals for the next release cycle for the SYCL backend and the overall architecture of GPU acceleration code in GROMACS.

10:50 - 11:20 CET TECH TALK

AdePT project - Experience with porting particle transport simulation to oneAPI

The AdePT project is an R&D activity led by CERN aiming to speed up the simulation of particle propagation in the detector regions dominated by electromagnetic physics. The project goals are to implement technical solutions to run particle transport on GPUs and to understand related portability/implementation/optimization issues. While the project currently targets only the NVIDIA platform, code portability and support for different hardware accelerators are long-term goals. In that context, the oneAdePT project has been started to investigate oneAPI as a possible way forward.

11:20 - 11:30 CET BREAK

11:30 - 1:00 CET HANDS ON SESSION

Single source heterogeneous programming with Data Parallel C++ and SYCL 2020 features

We will introduce oneAPI and Data Parallel C++ for heterogeneous programming. We begin by introducing this technology as an extension to standard C++ that incorporates parallelism directly into the language using the SYCL specification. We will look at SYCL 2020 features such as Unified Shared Memory, sub-groups, and reductions. We will work through some hands-on coding samples and see how DPC++ increases productivity and helps achieve performance.

1:00 - 1:20 CET LUNCH

1:20 - 2:50 CET HANDS ON SESSION

Porting NAMD to oneAPI DPC++

The NAMD parallel molecular dynamics code is designed for high-performance simulation of large biomolecular systems; it is used for tackling biomedically relevant challenges, such as the coronavirus, by providing insight at the atomic level of detail. NAMD is an important application for the upcoming Aurora supercomputer at Argonne National Laboratory, which will be accelerated by Intel Ponte Vecchio Xe-HPC GPUs. This hands-on session will discuss in some detail the work being undertaken to port the CUDA kernels in NAMD to oneAPI DPC++ in preparation for running on Aurora.

2:50 - 3:00 CET BREAK

3:00 - 3:40 CET VENDOR UPDATE

Experiences in Using oneAPI

3:40 - 4:10 CET TECH TALK

Evaluating CUDA Portability with HIPCL and DPCT

HIPCL expands the scope of the CUDA portability route from AMD platforms to any OpenCL platform. Meanwhile, the Intel DPC++ Compatibility Tool (DPCT) migrates CUDA programs to Data Parallel C++ (DPC++) programs. Toward the goal of portability enhancement, we evaluate the performance of CUDA applications from Rodinia, SHOC, and proxy applications ported to Intel GPUs using HIPCL and DPCT. After profiling the ported programs, we aim to understand their performance gaps and optimize the codes converted by DPCT to improve their performance. The open-source repository of the CUDA, HIP, and DPCT programs will be useful for the development of translators.

4:10 - 4:40 CET TECH TALK

Design, Development and Validation of DPC++ backend for OCCA

OCCA—an open-source, portable, and vendor-neutral framework for programming parallel architectures—is used by the U.S. Department of Energy and Shell in major scientific and engineering applications. This talk will provide insight into the development of a DPC++ backend for OCCA. Integral to this effort is the DPC++ Unified Shared Memory (USM) model. Factors influencing choices related to kernel translation and launching will also be discussed. The functional accuracy of an initial implementation of the OCCA DPC++ backend is validated on Intel GPU hardware. Finally, ongoing validation and performance analysis efforts will be outlined, along with plans for future development.

4:40 - 5:10 CET TECH TALK

Porting oneAPI DPC++ on Xilinx FPGA & Versal ACAP CGRA

Many accelerators come with programming environments suitable for electrical engineers or usable from machine-learning frameworks, but they remain difficult to use in an HPC context. Fortunately, SYCL 2020 can bring direct programming to various accelerators through the concept of generic back-ends. We are porting the open-source oneAPI DPC++ implementation to Xilinx Alveo FPGA cards and also targeting our Versal AIE CGRA with 400 VLIW vector processors. We extend SYCL with collaborative operations that use the distributed memory shared by the 2D processor neighborhood, which is useful for stencil code.

5:10 - 5:20 CET BREAK

5:20 - 5:50 CET KEYNOTE

TensorFlow and oneDNN in Partnership

Rapid growth in AI and machine learning innovations and workloads necessitates constant developments in both software and hardware infrastructure. TensorFlow, Google’s end-to-end open-source machine learning framework, and oneDNN have been collaborating closely to ensure users can fully utilize new hardware features and accelerators, with a focus on x86 architectures. This talk will cover recent projects such as int8 (AVX512_VNNI) and bfloat16 (AVX512_BF16) vectorization support, bringing custom oneDNN operations to vanilla TensorFlow, and the upcoming Intel XPU device plug-in for TensorFlow.

5:50 - 6:00 CET

Conclusion

6:00 - 7:00 CET HAPPY HOUR

Happy Hour

We will open the Happy Hour with a fun Jeopardy game where you can show off your oneAPI and DPC++ knowledge and win some fun prizes, including a book on DPC++. Then we will test your creativity with an exciting game of Skribbl.io.

Schedule: June 23

10:00 - 10:10 CET

Introduction

10:10 - 10:30 CET KEYNOTE

Porting Boris Particle Pusher to DPC++. Performance Analysis and Optimization on Intel CPUs and GPUs

The talk reports the results of porting one of the key computational kernels of the High-Intensity Collisions and Interactions (Hi-Chi) open-source numerical code to the DPC++ programming language. When planning to port the entire Hi-Chi code to DPC++, we started with the Boris particle pusher to assess the effort required for porting and the expected performance on state-of-the-art Intel CPUs and GPUs. We found that the optimized parallel C++ code can be ported relatively easily to DPC++, while performance on a Xeon Platinum 8260L differs only slightly (~10% on average) from the baseline C++ code, which is a reasonable price to pay for portability. It turned out that the code originally optimized for CPUs can now run on Intel GPUs (P630 and Iris Xe Max), demonstrating the expected performance, which can be improved further. In this talk, we will describe the porting experience, optimization techniques, and the current results of code optimization for Intel GPUs.

10:30 - 11:00 CET TECH TALK

Bringing SYCL to pre-exascale supercomputing with DPC++ for CUDA 

Perlmutter is a pre-exascale supercomputer for NERSC at Lawrence Berkeley National Laboratory consisting of more than 6,000 NVIDIA A100 GPUs. Codeplay is working in partnership with NERSC, LBNL, and Argonne National Laboratory to enable developers using this supercomputer and ALCF's ThetaGPU machine to write highly accelerated applications using the SYCL programming model.

This presentation will show how this open-source work will bring benefits to the whole community, enabling developers to port their CUDA code to SYCL while still being able to obtain comparable performance on Nvidia GPUs.

11:00 - 11:10 CET BREAK

11:10 - 12:35 CET HANDS ON SESSION

Application optimization with Cache-aware Roofline Model and Intel oneAPI tools

In this tutorial, we will introduce the Cache-aware Roofline Model (CARM) and expose its principles when modelling the performance of Intel CPU and GPU devices. We will also showcase how the CARM implementation in Intel® Advisor can be used to drive application optimization. For this purpose, we will rely on epistasis detection, an important application in bioinformatics, as a case study. For Intel GPUs, we will show how CARM can be used to detect execution bottlenecks and provide useful hints on which types of optimization to apply to fully exploit device capabilities. The guidelines provided by CARM were fundamental to achieving speedups of more than 20x compared to the baseline code.

12:35 - 12:55 CET LUNCH

12:55 - 2:20 CET HANDS ON SESSION

Word-Count with MapReduce on FPGA, A DPC++ Example

Many workloads have inherent data parallelism which can be leveraged to achieve optimal performance. However, it is challenging to design data-parallel programs and map them to different hardware targets. Intel's Data Parallel C++ is an open alternative for cross-architecture development that aims to address this challenge. In this talk, we cover MapReduce, a widely used programming model for processing large datasets in a distributed fashion, and how to adopt it for processing a large dataset on an FPGA. We use the word-count problem to explain how to design a data-parallel algorithm with the MapReduce paradigm, and we describe a variant of MapReduce suited to the unique characteristics of FPGAs. Finally, we demonstrate the design flow of DPC++ programming on Intel DevCloud and the relevant tools for performance analysis and optimization.

2:20 - 2:30 CET BREAK

2:30 - 3:00 CET TECH TALK

Performance-Portable Distributed k-Nearest Neighbors using Locality-Sensitive Hashing and SYCL

In the age of AI, algorithms must efficiently cope with vast data sets. We propose a performance-portable implementation of Locality-Sensitive Hashing (LSH), an approximate k-nearest neighbors algorithm, to speed up classification on heterogeneous hardware. Our new library provides a hardware-independent, yet efficient and distributed, implementation of the LSH algorithm using SYCL and MPI. The results show that our library scales across multiple GPUs, achieving a speedup of up to 7.6 on 8 GPUs. It supports different SYCL implementations—ComputeCpp, hipSYCL, DPC++—to target different hardware.

3:00 - 3:30 CET TECH TALK

Visualization of human-scale blood flow simulation using Intel OSPRay Studio on SuperMUC-NG

HemeLB is a highly scalable, 3D blood flow solver capable of generating high-resolution simulations of blood flow through human-scale vasculatures. Post-processing such simulations is a significant challenge, particularly due to the volume of data generated. We have utilized Intel OSPRay Studio to visualize generated data timeseries directly on the production machine — SuperMUC-NG at LRZ. We use a custom input plugin to efficiently map the simulation data whilst eliminating complex pre-processing. This allows the full domain simulated by HemeLB to be quickly observed by users for assessment.

3:30 - 4:00 CET TECH TALK

Ginkgo - An Open Source Math Library in the oneAPI Ecosystem

Ginkgo is an open-source math library designed for GPU-accelerated supercomputers. In this talk, we will present the path we took to prepare Ginkgo for Intel GPUs. We will start by reporting our experiences in porting the NVIDIA-focused software stack to Intel's DPC++ environment and the obstacles we encountered when using automated code conversion. We will then present the functionality Ginkgo currently provides for Intel GPUs, and the performance we achieve for key linear algebra building blocks on recent Intel GPUs. We will conclude by demonstrating how Ginkgo's DPC++ backend can be used to prepare scientific applications for the oneAPI ecosystem.

4:00 - 4:40 CET TECH TALK

HPC changing the paradigm of film animation as Tangent revolutionizes creative storytelling

Jeff Bell, CEO of Tangent Labs, will present a unique studio customer story highlighting the evolution of the digital creative workflow, leveraging HPC in the cloud. With a strong focus on the creative community, the company is embracing Blender for production filmmaking, including high-fidelity animated films for Netflix and others built with Intel's oneAPI Rendering Toolkit. The result is a shorter time to market: functions that once took days now take hours, with predictable rendering times, allowing Tangent to deliver films on budget and at the highest fidelity. Tangent and Intel's close collaboration also influences what Blender brings to market as an open-source solution for the film animation industry.

In late 2020, Tangent Labs released their fresh take on a cloud-based digital production pipeline, LoUPE, initially developed for use by sister company Tangent Animation to address challenges the community has faced for years, delivered on the AWS Studio Cloud. Remote collaboration has never been a more timely topic, and LoUPE advances the traditional on-premises production workflow by seamlessly facilitating collaboration between studio locations and freelancers around the globe. Data is securely stored in the cloud, with intelligent transit between locations ensuring synchronization.

Tangent Labs and Intel share a vision of providing improvements for creators, with LoUPE focused on the creative workflow and close collaboration with Intel on advancements in visualization development and the oneAPI Rendering Toolkit, which will be discussed.

4:40 - 5:10 CET TECH TALK

Integrating Arhat deep learning framework with Intel® oneDNN library and OpenVINO™ toolkit

Arhat is a specialized deep learning framework that converts neural network descriptions into lean, standalone executable code which can be directly deployed on various platforms. The Arhat backend for Intel platforms generates C++ code that directly calls oneDNN API functions and can run on any modern Intel CPU or GPU. Arhat includes an OpenVINO interoperability layer that consumes models produced by the OpenVINO Model Optimizer. We have used Arhat for a series of experiments benchmarking the Tiger Lake i7-1185G7E against the NVIDIA Jetson Xavier NX with various object detection neural networks. The benchmarking results demonstrate that Tiger Lake is a powerful competitor in the embedded application domain.

5:10 - 5:15 CET BREAK

5:15 - 6:00 CET KEYNOTE

oneAPI, SYCL and Standard C++: Where Do We Need To Go From Here?

I first discovered the joys of C++ in 1988 and joined its standardization effort two decades later to represent developers. Working at Argonne on the oneAPI/DPC++/SYCL backend for Kokkos, I see much of what HPC and heterogeneous computing developers need, both as one myself and as someone supporting others, and I believe that standardization is the way to address those needs long term.

I’ll be presenting my journey so far, ways I’d like to see oneAPI evolve, and how we are helping to guide those improvements into SYCL and ultimately standard C++ to handle our development needs for decades to come.

6:00 - 6:10 CET

Conclusion

6:10 - 7:00 CET HAPPY HOUR

Happy Hour

We will open the Happy Hour with a fun Jeopardy game where you can show off your oneAPI and DPC++ knowledge and win some fun prizes, including a book on DPC++. Then we will test your creativity with an exciting game of Skribbl.io.
