State Key Laboratory for Novel Software Technology, Nanjing University
Abstract:
Building efficient and scalable system software for large-scale systems, especially for performance analysis and monitoring, is increasingly important for both the developers of parallel applications and the designers of next-generation HPC systems. However, conventional performance tools suffer from significant time and space overhead as problem sizes and system scales keep growing. For instance, memory monitoring is critical for understanding applications and evaluating systems. Because programs' memory accesses are dynamic, common practice today leaves a large amount of address examination and data recording to runtime, at the cost of substantial performance overhead.
On the other hand, the cost of source code analysis is independent of the problem size and system scale, making it very appealing for large-scale performance analysis. Inspired by this observation, we have designed a series of lightweight system software tools for HPC systems, including a memory access monitoring tool, a performance variance detection tool, and a communication trace compression tool. In this talk, I will share our experience in building these tools by combining static analysis with runtime analysis, and point out the main challenges in this direction.
Speaker bio:
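Jidong Zhai (翟季冬) is an associate professor and doctoral advisor in the Department of Computer Science and Technology at Tsinghua University. His main research areas are high-performance computing, performance evaluation, and the performance analysis and optimization of large-scale parallel programs. From 2015 to 2016, he was a visiting assistant professor in the Department of Computer Science at Stanford University. His research has been published in leading international conferences and journals in high-performance computing, including SC, PPoPP, ICS, MICRO, ASPLOS, ATC, CGO, IEEE TPDS, and IEEE TC. His SC14 paper was selected as a Best Paper Finalist, the first time a scholar from mainland China was shortlisted for the award.
He has served as program committee chair of NPC 2018, program committee member of ACM/IEEE SC 2018 and 2019 and of PPoPP 2019, editorial board member of IEEE TPDS, young editorial board member of the journals FCS and JCST, and member of the Technical Committee on High Performance Computing of the China Computer Federation (CCF), among other roles.
He coaches the Tsinghua University student supercomputing team, which has won eight world championships under his guidance. In both 2015 and 2018, the team won the overall championship at all three major international supercomputing competitions (SC, ISC, and ASC), completing a "grand slam." The SC15 championship was the first time a university from mainland China won that competition.
His honors include the First Prize for Science and Technology Progress from the Ministry of Education, the First Prize in Science and Technology from the Chinese Institute of Electronics, the CCF Outstanding Doctoral Dissertation Award, and the Excellent Young Scientists Fund of the National Natural Science Foundation of China.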
Time: April 30 (Tuesday), 10:00
Venue: Room 230, Computer Science and Technology Building