oneAPI DevSummit at SC 2021

November 14, 2021 | 9 a.m.–6:30 p.m. CT

Join us for hands-on tutorials, tech talks, and panels spanning the oneAPI programming model, AI analytics, performance analysis tools, and libraries, with global industry experts from Berkeley, Argonne, NASA, Codeplay, the University of Lisbon, the University of Edinburgh, and more. Get the latest information on the Intel® oneAPI Toolkits since their initial production release in late 2020.

oneAPI Developer Summit at SC21

Agenda

DPC++
AI Analytics / FPGA
Libraries

DPC++

9:00 - 9:15 AM CT

Introduction/Opening

9:15 - 10:00 AM CT

Global Experts on eXtreme Performance Panel

The Intel eXtreme Performance Users Group (IXPUG) will present a brief overview of the organization and its activities, followed by a panel discussion focused on the expected adoption, support, and application of oneAPI at various computing sites around the world. Experts from these sites will discuss ongoing work with oneAPI and plans for its support and application at their sites, in addition to elaborating on the expected impact on their user communities. The session will close with a brief overview of opportunities for involvement in upcoming IXPUG activities.

 

Download Presentation Deck – Thomas Steinke

Download Presentation Deck – David Martin

Presenting

10:30 - 10:45 AM CT Break

10:45 - 11:45 AM CT

Developing for Nvidia GPUs using SYCL with oneAPI

Community support for SYCL is growing, with some of the most powerful supercomputers in the world (including Aurora, Perlmutter, and Frontier) adopting the programming model for cutting-edge research. By migrating your code from CUDA to SYCL you can still target Nvidia GPUs while also gaining the ability to deploy to a wider set of GPUs from other vendors, including Intel and AMD. This hands-on workshop will introduce the basics of setting up your development environment to target Nvidia GPUs with SYCL using oneAPI, and what you need to know to migrate your code from CUDA to SYCL. Find out how to port incrementally by using interoperability with native CUDA kernel code and libraries, and learn the fundamentals needed to get full performance with SYCL. In addition, learn how you can call CUDA libraries such as cuDNN or cuBLAS directly, or via existing SYCL libraries such as oneDNN, using oneAPI for CUDA.
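
For readers new to SYCL, here is a minimal sketch (not part of the workshop material) of a basic SYCL kernel that can also be built for Nvidia GPUs with the DPC++ CUDA backend; the compile command in the comment is illustrative only, and the exact flags depend on your toolchain version.

// Minimal SYCL vector add. With a DPC++ build that includes the CUDA backend it can be
// compiled for Nvidia GPUs, e.g.: clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda vadd.cpp
// (illustrative flags; older toolchains use <CL/sycl.hpp> instead of <sycl/sycl.hpp>).
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
  sycl::queue q{sycl::gpu_selector_v};           // SYCL 2020 selector; older releases use sycl::gpu_selector{}
  constexpr size_t n = 1 << 20;
  float *a = sycl::malloc_shared<float>(n, q);   // USM shared allocations
  float *b = sycl::malloc_shared<float>(n, q);
  float *c = sycl::malloc_shared<float>(n, q);
  for (size_t i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

  q.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
     c[i] = a[i] + b[i];                         // same body a CUDA __global__ kernel would have
   }).wait();

  std::printf("c[0] = %f on %s\n", c[0],
              q.get_device().get_info<sycl::info::device::name>().c_str());
  sycl::free(a, q); sycl::free(b, q); sycl::free(c, q);
}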

 

Download Presentation Deck

Presenting

11:45 - 12:45 PM CT Lunch

12:45 - 1:15 PM CT

Experience in Moving CUDA-Optimized FUN3D Kernels to Intel GPUs Using Intel oneAPI

This presentation provides an overview of recent efforts to port existing CUDA kernels relevant to unstructured-grid computational fluid dynamics to the oneAPI framework for execution on Intel GPUs. Differences between the programming models are examined and ongoing challenges are discussed.

 

Download Presentation Deck

Presenting

1:15 - 1:45 PM CT

Acceleration of Integrated Circuit Simulation using SYCL and oneAPI

Simulation of integrated circuits consists of solving matrix-based equations. As modern circuits grow in size, the computation time and resources required for a simulation increase significantly. Recent progress in heterogeneous hardware platforms has created an opportunity to increase the efficiency of these simulations. In this project, we demonstrate the acceleration of LU decomposition, the core algorithm in circuit solving, using SYCL and oneAPI on CPUs and GPUs. We present results and discuss the development experience and future opportunities in using SYCL and oneAPI.
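
A rough illustration of the pattern (not the presenters' implementation): a right-looking LU factorization without pivoting can be expressed in SYCL as a loop of data-parallel kernels over the trailing submatrix.

// Sketch only: naive LU without pivoting, run on whatever device the default queue selects.
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
  constexpr size_t n = 512;
  sycl::queue q;
  float *A = sycl::malloc_shared<float>(n * n, q);
  // Diagonally dominant matrix so the factorization is stable without pivoting.
  for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < n; ++j)
      A[i * n + j] = (i == j) ? float(n) : 1.0f;

  for (size_t k = 0; k < n - 1; ++k) {
    // Scale the k-th column below the diagonal to form the multipliers (the L factor).
    q.parallel_for(sycl::range<1>{n - k - 1}, [=](sycl::id<1> idx) {
       size_t i = k + 1 + idx[0];
       A[i * n + k] /= A[k * n + k];
     }).wait();
    // Rank-1 update of the trailing submatrix (the bulk of the parallel work).
    q.parallel_for(sycl::range<2>{n - k - 1, n - k - 1}, [=](sycl::id<2> idx) {
       size_t i = k + 1 + idx[0], j = k + 1 + idx[1];
       A[i * n + j] -= A[i * n + k] * A[k * n + j];
     }).wait();
  }
  std::printf("U(0,0) = %f\n", A[0]);
  sycl::free(A, q);
}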

 

Download Presentation Deck

Presenting

1:45 - 2:15 PM CT Break

2:15 - 2:45 PM CT

Performance of DPC++ on Representative Structured/Unstructured Mesh

In this session we will give an overview of the performance achieved with DPC++ on Intel server CPUs for MG-CFD, an unstructured-mesh CFD mini-app, and OpenSBLI, a structured-mesh academic CFD code. We will contrast the results with OpenMP implementations and explore key differences and bottlenecks based on VTune and Advisor feedback.

 

Download Presentation Deck

Presenting

2:45 - 3:15 PM CT

Enabling NAMD for Intel Xe

NAMD is a prominent parallel molecular dynamics application designed for high performance computing of large biomolecular systems. This session focuses on the development of NAMD for Intel GPUs using oneAPI/DPC++ by porting the efficient NAMD CUDA implementation and improving it with flexible vectorization for portable performance. We will also discuss the implementation in NAMD of relative debugging techniques across architectures and programming languages.

 

Download Presentation Deck

Presenting

3:15 - 3:30 PM CT Break

3:30 - 4:00 PM CT

Performance portability and evaluation of heterogeneous components of SeisSol targeted at upcoming Intel HPC GPUs

We will present our recent results from integrating the oneAPI programming model into SeisSol, a software package for simulating seismic waves and earthquake dynamics. During the talk, we will demonstrate comparisons of various SeisSol-specific benchmarks compiled and executed with oneAPI, hipSYCL, and CUDA. Finally, we will present the performance of the whole application obtained with two Nvidia RTX 3090 Turbo GPUs using oneAPI compiled with the CUDA backend.

 

Download Presentation Deck

Presenting

4:00 - 4:30 PM CT

Enhancing Online Planning on low-power CPU-GPU SoCs via Bloom Filter Based Memory

This work proposes a new design for online planning for intelligent agents modelled as POMDPs. We introduce an online planner enhanced with a Bloom filter memory, which we implement and evaluate on a low-power CPU+GPU SoC. By using the DPC++ parallel execution model for the most compute-intensive kernel of our Bloom filter implementation, we reduce overall planning time by 3.5x to 7.5x on three representative benchmarks from the POMDP literature. Our preliminary results open new opportunities for using POMDP agents on low-power mobile platforms and in real-time use cases.
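
As a hedged sketch of the general idea (not the authors' code), a Bloom-filter membership test maps naturally onto a single DPC++ parallel_for; the two hash functions below are arbitrary placeholders.

// Sketch: data-parallel Bloom-filter queries. Bit array stored one byte per bit for simplicity.
#include <sycl/sycl.hpp>
#include <cstdint>
#include <cstdio>

static inline uint32_t hash1(uint32_t x) { x ^= x >> 16; x *= 0x7feb352dU; x ^= x >> 15; return x; }
static inline uint32_t hash2(uint32_t x) { x ^= x >> 15; x *= 0x846ca68bU; x ^= x >> 16; return x; }

int main() {
  constexpr uint32_t nbits = 1u << 20;
  constexpr size_t nqueries = 1 << 16;
  sycl::queue q;
  uint8_t  *bits = sycl::malloc_shared<uint8_t>(nbits, q);
  uint32_t *keys = sycl::malloc_shared<uint32_t>(nqueries, q);
  uint8_t  *hit  = sycl::malloc_shared<uint8_t>(nqueries, q);

  for (uint32_t i = 0; i < nbits; ++i) bits[i] = 0;
  for (uint32_t v = 0; v < 2 * nqueries; v += 2) {   // insert the even keys on the host
    bits[hash1(v) % nbits] = 1;
    bits[hash2(v) % nbits] = 1;
  }
  for (size_t i = 0; i < nqueries; ++i) keys[i] = static_cast<uint32_t>(i);

  // The membership test for all query keys runs as one data-parallel kernel.
  q.parallel_for(sycl::range<1>{nqueries}, [=](sycl::id<1> i) {
     uint32_t v = keys[i];
     hit[i] = bits[hash1(v) % nbits] & bits[hash2(v) % nbits];
   }).wait();

  size_t positives = 0;
  for (size_t i = 0; i < nqueries; ++i) positives += hit[i];
  std::printf("%zu of %zu keys possibly present\n", positives, nqueries);
  sycl::free(bits, q); sycl::free(keys, q); sycl::free(hit, q);
}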

 

Download Presentation Deck

Presenting

4:30 - 5:30 PM CT

The oneAPI Software Abstraction for Heterogeneous Computing

oneAPI is a cross-industry, open, standards-based unified programming model. The oneAPI specification extends existing developer programming models to enable a diverse set of hardware through language, a set of library APIs, and a low-level hardware interface to support cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack to improve productivity and innovation. At the core of oneAPI is the DPC++ programming language, which builds on the ISO C++ and Khronos SYCL standards. DPC++ provides explicit parallel constructs and offload interfaces to support a broad range of accelerators. In addition to DPC++, oneAPI also provides libraries for compute- and data-intensive domains, e.g., deep learning, scientific computing, video analytics, and media processing. Finally, a low-level hardware interface defines a set of capabilities and services to allow a language runtime system to effectively utilize a hardware accelerator.
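
A minimal sketch of those constructs, using the SYCL buffer/accessor offload interface (device selection and the reported device name will vary with the installed runtime):

#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
  std::vector<int> data(1024, 1);
  sycl::queue q;                                       // default device: CPU, GPU, or other accelerator
  {
    sycl::buffer<int, 1> buf(data.data(), sycl::range<1>(data.size()));
    q.submit([&](sycl::handler &h) {
      sycl::accessor acc(buf, h, sycl::read_write);    // declares the kernel's data dependency
      h.parallel_for(sycl::range<1>(data.size()), [=](sycl::id<1> i) {
        acc[i] *= 2;                                   // explicit data-parallel construct
      });
    });
  }                                                    // buffer destruction copies results back to 'data'
  std::cout << "data[0] = " << data[0] << " on "
            << q.get_device().get_info<sycl::info::device::name>() << "\n";
}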

Presenting

5:30 - 6:30 PM CT

Happy Hour

AI Analytics / FPGA

9:00 - 9:15 AM CT

Introduction/Opening

9:15 - 10:00 AM CT

Global Experts on eXtreme Performance Panel

The Intel eXtreme Performance Users Group (IXPUG) will present a brief overview of the organization and its activities, followed by a panel discussion focused on the expected adoption, support, and application of oneAPI at various computing sites around the world. Experts from these sites will discuss ongoing work with oneAPI and plans for its support and application at their sites, in addition to elaborating on the expected impact on their user communities. The session will close with a brief overview of opportunities for involvement in upcoming IXPUG activities.

 

Download Presentation Deck – Thomas Steinke

Download Presentation Deck – David Martin

Presenting

10:30 - 10:45 AM CT Break

11:45 - 12:45 PM CT Lunch

12:45 - 1:15 PM CT

Spatial DPC++ constructs for algorithm acceleration with FPGAs

Field programmable gate arrays (FPGAs) have gained increasing mindshare as an architecture through which workloads can be accelerated in a power-efficient way, particularly when existing accelerators aren't tuned for or well matched with a workload of interest. They allow a custom architecture to be built for the algorithm of interest without resorting to costly ASIC design, and therefore bridge the gap between a flexible ISA-based architecture that isn't quite the right fit for the workload and a custom ASIC designed specifically for it.
As a reconfigurable spatial architecture, FPGAs allow algorithms to be implemented in a fundamentally different way from instruction set architecture (ISA) accelerators. They can be thought of as implementing an algorithm at the same level of abstraction at which an ISA machine would itself be designed and constructed. To enable productivity compared with conventional chip design languages and tools, high-level languages including SYCL (and DPC++, which extends SYCL) have been made available to program FPGAs.
This talk will give an overview of the most significant language features on top of SYCL that simplify the expression of spatial constructs, and will depict how these constructs map to a device. Specifically, parallel execution of kernels with pairwise independent forward progress will be described, as will the variety of communication mechanisms enabled by the pipes feature. Memory-system tuning controls will be covered, and the recommended report-driven development methodology will be described.

The audience will leave this talk with an overview of the spatial language constructs exposed in SYCL and DPC++, and how to reason about them. Attending will make it easier to get started adapting algorithms for spatial accelerator implementation, both in terms of initial implementation and optimization.
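
A minimal pipes sketch (assuming the Intel FPGA extensions that ship with the oneAPI FPGA add-on; header and selector names vary slightly between releases, and the pipe capacity here is arbitrary):

// Two single_task kernels connected by a pipe; intended for the FPGA emulator or FPGA hardware.
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
#include <cstdio>

constexpr int kCount = 16;
// A pipe is identified by a type; the capacity equals kCount so the producer never blocks.
using DataPipe = sycl::ext::intel::pipe<class DataPipeID, int, kCount>;

int main() {
  // Recent releases expose sycl::ext::intel::fpga_emulator_selector_v; substitute the
  // selector name your oneAPI version provides.
  sycl::queue q{sycl::ext::intel::fpga_emulator_selector_v};
  int *out = sycl::malloc_shared<int>(kCount, q);

  q.single_task<class Producer>([=] {                  // streams values into the pipe
    for (int i = 0; i < kCount; ++i) DataPipe::write(i);
  });
  q.single_task<class Consumer>([=] {                  // reads them back; kernels progress independently
    for (int i = 0; i < kCount; ++i) out[i] = DataPipe::read() * 2;
  });
  q.wait();

  std::printf("out[%d] = %d\n", kCount - 1, out[kCount - 1]);
  sycl::free(out, q);
}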

 

Download Presentation Deck

Presenting

1:15 - 1:45 PM CT

oneAPI AI Analytics – End to End

Learn how to use an end-to-end machine learning platform to build and deploy Intel AI models at scale. Bridge science and engineering teams in a clear and collaborative machine learning management environment in which they can communicate and reproduce results with interactive workspaces, dashboards, dataset organization, experiment tracking and visualization, a model repository, and an API to consume them. All of this is made possible by cnvrg.io, a unique open-source platform, and the Intel AI Analytics Toolkit, both of which we will illustrate during the session.

 

Download Presentation Deck

Presenting

1:45 - 2:15 PM CT Break

2:15 - 2:45 PM CT

Using the Arhat framework with the Intel® oneDNN library and OpenVINO™ toolkit for object detection applications

Arhat is a cross-platform deep learning framework that converts neural network descriptions into lean, standalone executable code. This approach offers significant benefits thanks to a simple and straightforward deployment process.
Arhat is integrated with the Intel oneAPI deep learning libraries. The Arhat backend for Intel hardware generates C++ code that directly calls the oneDNN API. Furthermore, Arhat provides a module that consumes models produced by the OpenVINO Model Optimizer.
I will present recent case studies on using Arhat to build object detection applications on Intel CPU and GPU hardware. These studies cover models from the OpenVINO Model Zoo as well as models from the Detectron2 library.

 

Download Presentation Deck

Presenting

2:45 - 3:15 PM CT

The Great CEED Bake-off: DPC++ Edition

The CEED Bake-off Problems are a collection of benchmarks representing important compute-intensive kernels and solvers relevant to high-order finite and spectral element methods, such as those used in the Nek5000 CFD code. In this talk we present a DPC++ implementation of the CEED Bake-off Problems. Benchmark results are given for Intel CPUs and GPUs. Intel Advisor is used to conduct cache-aware roofline analysis and understand performance. Finally, batched routines from the Intel oneMKL BLAS-like extensions are explored as a replacement for certain directly programmed DPC++ kernels.

 

Download Presentation Deck

Presenting

3:15 - 3:30 PM CT Break

3:30 - 3:45 PM CT

Accelerating Deep Learning with Intel Extension for PyTorch: a MedMNIST Classification Decathlon example

We showcase how to use the Intel Extension for PyTorch (IPEX) for training and inference on the MedMNIST datasets, a collection of 10 MNIST-like open datasets covering various medical imaging classification tasks such as pathology images, chest X-rays, and OCT images. The demo runs on Ice Lake processors in the Intel DevCloud for oneAPI. We compare performance with stock PyTorch and observe the performance gains that the Intel Extension for PyTorch offers.

 

Download Presentation Deck

Presenting

3:45 - 4:00 PM CT

Inference with ArrayFire and oneAPI​

This session will demonstrate a simple ML inference pipeline using the OpenCL interoperability of oneAPI. The ArrayFire library and the derivative Flashlight project will be introduced and used as motivating examples. Data will flow from the oneAPI Video Processing Library into these existing libraries as an example of integrating oneAPI with existing GPU codebases.

Download Presentation Deck

Presenting

4:00 - 4:30 PM CT

Edge Intelligence and Its Application in CAVs

The proliferation of the Internet of Things and the success of rich cloud services have pushed the horizon of a new computing paradigm, edge computing, which calls for processing data at the edge of the network. Edge computing has the potential to address concerns around response-time requirements, battery-life constraints, bandwidth cost savings, and data safety and privacy. In this talk, Prof. Weisong Shi will discuss the state of the art of edge computing, followed by the emergence of edge intelligence (EI) and how EI applies to connected and autonomous vehicles (CAVs).
Meanwhile, real-world applications usually call for multiple DNN models to collaborate on heterogeneous edge platforms to complete complicated tasks with outstanding performance. However, due to the explosive growth in computational requirements, model sizes, the number of models involved, and the number of participating devices, a fundamental question underlying most practical applications urgently needs to be answered: how can these collaborative models be deployed and executed concurrently and efficiently on heterogeneous edge devices with different deployment constraints? In this talk, Yongtao Yao will also introduce solutions for scheduling multiple collaborative DNNs on a group of heterogeneous edge devices with the goal of reducing overall latency.

 

Download Presentation Deck

Presenting

4:30 - 5:30 PM CT

The oneAPI Software Abstraction for Heterogeneous Computing

oneAPI is a cross-industry, open, standards-based unified programming model. The oneAPI specification extends existing developer programming models to enable a diverse set of hardware through language, a set of library APIs, and a low-level hardware interface to support cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack to improve productivity and innovation. At the core of oneAPI is the DPC++ programming language, which builds on the ISO C++ and Khronos SYCL standards. DPC++ provides explicit parallel constructs and offload interfaces to support a broad range of accelerators. In addition to DPC++, oneAPI also provides libraries for compute- and data-intensive domains, e.g., deep learning, scientific computing, video analytics, and media processing. Finally, a low-level hardware interface defines a set of capabilities and services to allow a language runtime system to effectively utilize a hardware accelerator.

Presenting

Libraries

9:00 - 9:15 AM CT

Introduction/Opening

9:15 - 10:00 AM CT

Global Experts on eXtreme Performance Panel

The Intel eXtreme Performance Users Group (IXPUG) will present a brief overview of the organization and its activities, followed by a panel discussion focused on the expected adoption, support, and application of oneAPI at various computing sites around the world. Experts from these sites will discuss ongoing work with oneAPI and plans for its support and application at their sites, in addition to elaborating on the expected impact on their user communities. The session will close with a brief overview of opportunities for involvement in upcoming IXPUG activities.

 

Download Presentation Deck – Thomas Steinke

Download Presentation Deck – David Martin

Presenting

10:30 - 10:45 AM CT Break

10:45 - 11:45 AM CT

Multi-GPU Programming - Scale-Up and Scale-Out Made Easy Using the Intel MPI Library

For shared-memory programming of GPGPU systems, users either have to manually map their domain decomposition onto the available GPUs and GPU tiles, or leverage implicit scaling mechanisms that transparently scale their offload code across multiple GPU tiles. The former approach can be cumbersome, and the latter is not always the best performing one.
The Intel MPI Library can take that burden off users by enabling them to program for just a single GPU or tile and leave the distribution to the library. This can make HPC/GPU programming much easier.
To that end, Intel MPI not only allows individual MPI ranks to be pinned to individual GPUs or tiles, but also lets users pass GPU memory pointers directly to the library.
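
A hedged sketch of what that looks like in practice, assuming an Intel MPI installation with GPU support enabled (the I_MPI_OFFLOAD=1 run-time setting in the comment is an assumption about the configuration, not something stated in the abstract):

// Rank 0 fills a device-resident buffer on its GPU and hands the GPU pointer straight to MPI.
// Assumes GPU buffer support is enabled in Intel MPI (e.g. I_MPI_OFFLOAD=1); this is an assumption.
#include <mpi.h>
#include <sycl/sycl.hpp>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  sycl::queue q{sycl::gpu_selector_v};
  constexpr int n = 1024;
  int *buf = sycl::malloc_device<int>(n, q);           // GPU memory, never staged through the host here

  if (rank == 0) {
    q.fill(buf, 42, n).wait();                         // produce data on the GPU
    MPI_Send(buf, n, MPI_INT, 1, 0, MPI_COMM_WORLD);   // device pointer passed directly to the library
  } else if (rank == 1) {
    MPI_Recv(buf, n, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    int first = 0;
    q.memcpy(&first, buf, sizeof(int)).wait();         // fetch one element for a quick sanity check
    std::printf("rank 1 received %d\n", first);
  }

  sycl::free(buf, q);
  MPI_Finalize();
}

Launched, for example, with mpirun -n 2 on a node where each rank can see a GPU.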

 

Download Presentation Deck

Presenting

11:45 - 12:45 PM CT Lunch

3:30 - 4:00 PM CT

Accelerating epistasis detection on Intel CPUs and discrete GPUs with Intel® Advisor

In this tutorial, we will introduce the Cache-Aware Roofline Model (CARM) and present its basic principles for modelling the performance upper bounds of Intel CPU and GPU devices. For this purpose, we will rely on epistasis detection, an important application in bioinformatics, as a case study. Using DPC++ to deploy the application on the Intel Iris Xe MAX (DG1) GPU, we will show how the Intel® Advisor CARM can be used to detect execution bottlenecks and provide useful hints on which types of optimizations to apply in order to fully exploit both CPU and GPU capabilities, and how it is also actionable for designing hybrid CPU+GPU codes.

Using oneAPI tools and Intel Advisor, we can obtain essential insights to improve our DPC++ bioinformatics application and achieve speedups on Intel devices.

 

Download Presentation Deck

Presenting

1:15 - 1:45 PM CT

Visual Analysis Challenges in the Age of Data

Ninety percent of all data in the world has been created in the past two years alone, at a rate of exabytes per day. New data of all kinds — structured, unstructured, quantitative, qualitative, spatial, and temporal — is growing exponentially and in every way. Given the vast amount of data being produced, one of our greatest scientific challenges is to effectively understand and make use of it. In this talk, I will present recent visual analysis research and applications in science, engineering, and medicine from the oneAPI Center of Excellence at the Scientific Computing and Imaging Institute and discuss current and future visualization research challenges.

Download Presentation Deck

Presenting

1:45 - 2:15 PM CT Break

2:15 - 2:45 PM CT

A Synergistic Approach for Abstracting Hardware Heterogeneity and Reducing Algorithmic Complexity: Powering HiCMA with oneAPI for HPC Scientific Applications

We improve the performance of HPC scientific applications using tile low-rank matrix computations. The idea consists of revisiting tile algorithms with low-rank matrix approximations, exploiting the data sparsity of the dense operators arising in computational astronomy, seismic imaging, and climate/weather prediction applications. We rely on the HiCMA software library to provide the sequential numerical kernels and on a oneAPI runtime system to orchestrate the resulting computational tasks on parallel systems. We demonstrate performance superiority against state-of-the-art numerical libraries while keeping high productivity in mind.

 

Download Presentation Deck

Presenting

12:45 - 1:15 PM CT

Getting Ready for the Aurora Exascale Supercomputer Using Intel Advisor Roofline on Intel CPUs and GPUs

Aurora at Argonne National Laboratory is one of the US DOE's exascale supercomputers and will be deployed in 2022. oneAPI provides all the essential components for porting applications to Aurora with optimal performance. The Intel Advisor roofline features in oneAPI provide intuitive performance analysis results on Intel GPUs and useful insights into performance bottlenecks for further optimization. We present Advisor use cases from our workloads in the MD (molecular dynamics) and CFD (computational fluid dynamics) areas.

 

Download Presentation Deck

Presenting

3:15 - 3:30 PM CT Break

2:45 - 3:15 PM CT

Driving a New Era of Accelerated Computing using OpenMP* with Intel® oneAPI Compilers

You are already deeply invested in OpenMP for multicore, so just a few additions will launch your code into the xPU era!
OpenMP* is a popular, portable, and widely supported programming model. OpenMP provides capabilities for threaded and task-based parallelism on multicore processors, data-parallel programming using Single Instruction Multiple Data (SIMD) for vector architectures, and, most recently, a programming model for offloading to accelerators on heterogeneous architectures (xPUs). In this session we will demonstrate how you can use the OpenMP offload model with the latest Intel® oneAPI compilers to drive a new era of accelerated computing with your applications.
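
As a small illustration of those "few additions" (not taken from the session), a multicore OpenMP loop becomes an offloaded one with a single target directive; the compile flags in the comment are indicative only.

// SAXPY offloaded with OpenMP target directives. Indicative Intel compile line:
//   icpx -fiopenmp -fopenmp-targets=spir64 saxpy.cpp   (flags may differ by compiler version)
#include <cstdio>
#include <vector>

int main() {
  const int n = 1 << 20;
  std::vector<float> x(n, 1.0f), y(n, 2.0f);
  const float a = 3.0f;
  float *px = x.data(), *py = y.data();

  // The multicore version would simply be: #pragma omp parallel for
  // The offload version maps the arrays to the device and distributes the loop across teams.
  #pragma omp target teams distribute parallel for map(to: px[0:n]) map(tofrom: py[0:n])
  for (int i = 0; i < n; ++i)
    py[i] = a * px[i] + py[i];

  std::printf("y[0] = %f\n", py[0]);
}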

 

Download Presentation Deck

Presenting

4:00 - 4:30 PM CT

Exploiting Heterogeneous Computing with Intel® oneAPI Threading Building Blocks (oneTBB)

This session will discuss how to utilize Intel® oneAPI Threading Building Blocks (oneTBB) to balance workloads across heterogeneous compute resources. As xPU programming grows, applications should be able to utilize the CPU plus other devices to maximize throughput.
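
One possible shape of such CPU-plus-device work sharing is sketched below; the static 50/50 split and the SYCL queue used for the offload side are illustrative assumptions, not the presenters' approach.

// Half of the array is processed on a SYCL device while oneTBB spreads the other half across CPU cores.
#include <oneapi/tbb/parallel_for.h>
#include <oneapi/tbb/blocked_range.h>
#include <sycl/sycl.hpp>
#include <vector>
#include <cstdio>

int main() {
  constexpr size_t n = 1 << 22;
  std::vector<float> data(n, 1.0f);
  sycl::queue q{sycl::property::queue::in_order()};    // in-order: copy, kernel, copy back run in sequence
  const size_t split = n / 2;                          // first half to the device, second half to the CPU

  float *dev = sycl::malloc_device<float>(split, q);
  q.memcpy(dev, data.data(), split * sizeof(float));
  q.parallel_for(sycl::range<1>{split}, [=](sycl::id<1> i) {
    dev[i] = dev[i] * 2.0f + 1.0f;
  });

  // The CPU half runs on all cores via oneTBB while the device half is in flight.
  oneapi::tbb::parallel_for(oneapi::tbb::blocked_range<size_t>(split, n),
    [&](const oneapi::tbb::blocked_range<size_t> &r) {
      for (size_t i = r.begin(); i != r.end(); ++i)
        data[i] = data[i] * 2.0f + 1.0f;
    });

  q.memcpy(data.data(), dev, split * sizeof(float)).wait();  // completes after the kernel (in-order queue)
  sycl::free(dev, q);
  std::printf("data[0] = %f, data[n-1] = %f\n", data[0], data[n - 1]);
}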

 

Download Presentation Deck

Presenting

4:30 - 5:30 PM CT

The oneAPI Software Abstraction for Heterogeneous Computing

oneAPI is a cross-industry, open, standards-based unified programming model. The oneAPI specification extends existing developer programming models to enable a diverse set of hardware through language, a set of library APIs, and a low-level hardware interface to support cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack to improve productivity and innovation. At the core of oneAPI is the DPC++ programming language, which builds on the ISO C++ and Khronos SYCL standards. DPC++ provides explicit parallel constructs and offload interfaces to support a broad range of accelerators. In addition to DPC++, oneAPI also provides libraries for compute- and data-intensive domains, e.g., deep learning, scientific computing, video analytics, and media processing. Finally, a low-level hardware interface defines a set of capabilities and services to allow a language runtime system to effectively utilize a hardware accelerator.

Presenting