Thursday February 27, 2025 9:00am - 3:30pm CST
Course Skill Level: 25% basic content, 25% intermediate content, and 50% advanced content

Materials: Attendees will need to bring their laptop to access materials during the workshop. There will be power, but please charge in advance as some outlets may need to be shared.

Abstract:
This hands-on workshop will present two performance evaluation tools, HPCToolkit and TAU, for evaluating and optimizing the performance of GPU-accelerated HPC and AI applications.

HPCToolkit (https://hpctoolkit.org) is an integrated suite of tools for profiling and tracing parallel programs on computers ranging from multicore desktop systems to GPU-accelerated supercomputers and cloud platforms. HPCToolkit can measure and analyze executions of fully optimized, dynamically linked parallel applications on tens of thousands of CPU cores and GPUs. It supports multilingual codes with external binary-only libraries. It collects sampling-based measurements of CPU codes with controllable overhead. It measures GPU performance using vendor APIs to collect fine-grained measurements through PC sampling or instrumentation, and it monitors asynchronous GPU operations using activity APIs. HPCToolkit can attribute performance measurements to rich dynamic calling contexts containing procedures, inlined functions, loop nests, and source lines on both CPUs and GPUs.

The TAU Performance System [http://tau.uoregon.edu] is a versatile performance evaluation toolkit supporting both profiling and tracing modes of measurement. It supports performance evaluation of applications running on CPUs and GPUs, and it supports runtime preloading of a Dynamic Shared Object (DSO) that allows users to measure performance without modifying the source code or binary. This tutorial will describe how TAU may be used with MVAPICH and support advanced performance introspection capabilities at the runtime layer. TAU's support for tracking the idle time spent in implicit barriers within collective operations will be demonstrated. TAU also supports event-based sampling at the function, file, and statement level. TAU's support for runtime systems such as CUDA (for NVIDIA GPUs), Level Zero (for Intel oneAPI DPC++/SYCL), ROCm (for AMD GPUs), OpenMP with support for OMPT and Target Offload directives, Kokkos, and MPI allows instrumentation at the runtime system layer while using sampling to evaluate statement-level performance data.

HPCToolkit and TAU will be demonstrated on AWS using the ParaTools Pro for E4S(TM) image. The Extreme-scale Scientific Software Stack (E4S) [https://e4s.io] is a curated, Spack-based software distribution of 100+ HPC and AI/ML packages. The Spack package manager is a core component of E4S, and it is a platform for product integration and deployment of performance evaluation tools such as HPCToolkit, TAU, DyninstAPI, PAPI, etc., and supports both bare-metal and containerized deployment for CPU and GPU platforms. E4S provides a Spack binary cache and a set of base and full-featured container images with vendor runtimes to support GPU architectures from NVIDIA, Intel, and AMD. E4S is a community effort to provide open-source software packages for developing, deploying, and running scientific applications and tools on HPC platforms.
Speakers

John Mellor-Crummey

Professor of Computer Science and of Electrical and Computer Engineering, Rice University
John Mellor-Crummey is a Professor of Computer Science at Rice University in Houston, TX. His research focuses on software technology for high-performance parallel computing. His current research focus is tools for measurement and analysis of application performance. He leads the...

Sameer Shende

Research Professor and Director of the Performance Research Laboratory, University of Oregon
Sameer Shende serves as a Research Associate Professor and the Director of the Performance Research Laboratory at the University of Oregon and the President and Director of ParaTools, Inc. (USA) and ParaTools, SAS (France). He serves as the lead developer of the Extreme-scale Scientific...
9th Floor Room 902
