Room 205 A, Music Hall
Thu, Jun 12, 1 p.m. to 5 p.m.
Nashville
Summary
This year, the tutorial focuses on techniques that allow deep learning practitioners to accelerate the training and inference of large transformer networks while also reducing memory requirements, across a spectrum of off-the-shelf hardware, for important applications such as autonomous driving and large language models. Topics include, but are not limited to:
- Deep learning hardware overview. We will review the architecture of the most commonly used deep
learning acceleration hardware: GPUs and TPUs. We will cover the main compute units and memory
modules.
- How deep learning is performed on this hardware. We will cover arithmetic intensity and
provide an overview of the theoretical limits of compute. Attendees will learn how to estimate processing time and
latency from hardware specifications and the network architecture alone.
- Best practices for acceleration. We will provide an overview of best practices for designing efficient neural
networks. Topics of interest will include guidance on channel count selection, compute-heavy operations,
reduction operations, etc.
- Existing tools for model acceleration. In this part we will focus on existing tools for accelerating a trained
neural network on GPU devices. In particular, we will discuss operation folding, TensorRT, ONNX graph
optimization, and sparsity.
- Research overview of recent techniques. This part will cover advanced techniques for post-training
model optimization, with a focus on the most recent work (the past 4 years). Topics will include
pruning, quantization, model distillation, NAS, etc.
- Foundation models. This part will cover the most recent techniques for training and deploying foundation
models efficiently.
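The arithmetic-intensity reasoning above can be sketched with a simple roofline-style estimate: a layer's latency is lower-bounded by the larger of its compute time and its memory-transfer time. The sketch below is illustrative, not tutorial material; the peak-throughput numbers are assumptions roughly matching an A100-class GPU and should be replaced with your own hardware's datasheet values.

```python
# Roofline-style latency estimate for a single layer. A layer is
# compute-bound or memory-bound depending on its arithmetic intensity
# (FLOPs per byte moved) relative to the hardware's FLOP/byte ratio.

def layer_latency_s(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Lower-bound latency: the layer cannot finish faster than either
    its compute time or its memory-transfer time."""
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / peak_bytes_per_s
    return max(compute_time, memory_time)

# Illustrative hardware specs (assumed, A100-class; check your datasheet):
PEAK_FLOPS = 312e12  # 312 TFLOP/s FP16 tensor-core peak
PEAK_BW = 2.0e12     # 2.0 TB/s HBM bandwidth

# Example layer: a 4096 x 4096 x 4096 FP16 GEMM
M = N = K = 4096
flops = 2 * M * N * K                      # one multiply + one add per MAC
bytes_moved = 2 * (M * K + K * N + M * N)  # FP16 = 2 bytes per element

t = layer_latency_s(flops, bytes_moved, PEAK_FLOPS, PEAK_BW)
intensity = flops / bytes_moved
bound = "compute" if flops / PEAK_FLOPS > bytes_moved / PEAK_BW else "memory"
print(f"arithmetic intensity: {intensity:.0f} FLOP/byte")
print(f"bound: {bound}, latency lower bound: {t * 1e6:.0f} us")
```

Shrinking the GEMM (e.g., a small batch at inference time) drives the arithmetic intensity down until the layer becomes memory-bound, at which point extra FLOP/s no longer help, which is exactly the kind of estimate the session derives from hardware specs and network architecture alone.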
Schedule
13:00–13:05  Opening Remarks
13:05–14:30  Jason Clemons: Understanding Hardware for Optimization
             - Hardware Overview
             - DL Performance Considerations for GPUs
             - Suggestions
             - Some Tools for Optimization
14:40–15:15  Alex Sun: Hardware-Aware Transformer Acceleration: Maximizing Your GPU Utilization
15:30–16:30  Hongxu (Danny) Yin: Towards Efficient Multi-modal Foundation Models
16:30–17:00  Q&A and Closing Remarks
Instructors
Alex (Xinglong) Sun is an AI researcher on the Applied AV Research team at NVIDIA. Prior to joining NVIDIA, Alex pursued his graduate studies and research at Stanford and UIUC, majoring in computer science. He is interested in efficient deep learning, visual perception, and end-to-end autonomous driving.
Jason Clemons received his Ph.D. in computer science and engineering from the University of Michigan, Ann
Arbor, MI, USA, where he researched computer architectures for mobile computer vision. As a senior research
scientist at NVIDIA, his current research focuses on domain-specific computing, in particular the intersection
of machine learning, computer vision, and computer architecture. He has worked on machine learning accelerators,
computer vision accelerators, accelerating DNN training on GPUs, and accelerating RL using GPUs. He is an IEEE
senior member and serves on the steering committee of the IEEE International Symposium on Performance Analysis
of Systems and Software.
Hongxu (Danny) Yin is a senior research scientist in Learning and Perception Research (LPR) at NVIDIA. He
obtained his Ph.D. from Princeton University, New Jersey, USA, and his B.Eng. from Nanyang Technological University, Singapore. He is a recipient of the Princeton Yan Huo *94 Graduate Fellowship, a departmental Best Dissertation
Finalist award at Princeton, the Princeton Natural Sciences and Engineering Fellowship, the Defense Science & Technology Agency gold medal, and the Thomson Asia Pacific Holdings gold medal. His research interests mainly include
data- and execution-efficient and secure deep learning, spanning CNNs and transformers. He has organized
several tutorials and workshops at CVPR and ICCV. He has been featured in 36Kr's Global Outstanding Chinese Power 100
and Forbes' Top 60 Elite Chinese in North America.
Organizers
- Maying Shen, Senior Research Engineer
- Jason Clemons, Senior Research Scientist
- Hongxu (Danny) Yin, Senior Research Scientist
- Pavlo Molchanov, Director of Research
- Jose M. Alvarez, Director of Research
- Jan Kautz, VP of Research