In this video from the PASC17 conference, Haohuan Fu from Tsinghua University in China presents: Scaling and Optimizing Climate and Weather Forecasting Programs on Sunway TaihuLight.
“The Sunway TaihuLight supercomputer is the world’s first system with a peak performance greater than 100 PFlops and a parallel scale of over 10 million cores. In contrast to other existing heterogeneous supercomputers, which generally include both CPU processors and PCIe-connected many-core accelerators (NVIDIA GPU or Intel MIC), the computing power of TaihuLight is provided by a homegrown many-core SW26010 CPU that includes both the management processing elements (MPEs) and computing processing elements (CPEs) in one chip. This talk reports efforts on refactoring and optimizing the climate and weather forecasting programs – CAM and WRF – on Sunway TaihuLight. To map the large code base to the millions of cores on the Sunway system, OpenACC-based refactoring was taken as the major approach, with source-to-source translator tools applied to exploit the most suitable parallelism for the CPE cluster and to fit the intermediate variable into the limited on-chip fast buffer. For individual kernels, when comparing the original ported version using only MPEs and the refactored version using both the MPE and CPE clusters, we achieve up to 22x speedup for the compute-intensive kernels. For the 25km resolution CAM global model, the code scales to 24,000 MPEs and 1,536,000 CPEs, achieving a simulation performance of 2.81 model years per day.”
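As a quick sanity check on the scaling figures quoted above, the short Python sketch below works out the CPE-to-MPE ratio and the total core count of the 25km CAM run, and converts the reported throughput into simulated days per wall-clock day. The function names are made up for illustration, and the 365-day model year is an assumption on our part rather than something stated in the abstract.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Function names are illustrative; the 365-day model year is an assumption.

def cores_in_run(mpes: int, cpes: int) -> dict:
    """Return the CPE-per-MPE ratio and total core count for a run."""
    if mpes <= 0 or cpes % mpes != 0:
        raise ValueError("expected a whole number of CPEs per MPE")
    return {"cpes_per_mpe": cpes // mpes, "total_cores": mpes + cpes}


def simulated_days_per_day(model_years_per_day: float,
                           days_per_model_year: int = 365) -> float:
    """Convert model years per wall-clock day into simulated days per day."""
    return model_years_per_day * days_per_model_year


run = cores_in_run(mpes=24_000, cpes=1_536_000)
print(run["cpes_per_mpe"])   # CPEs driven by each MPE
print(run["total_cores"])    # MPEs plus CPEs used by the run
print(simulated_days_per_day(2.81))
```

The ratio comes out at 64 CPEs per MPE, for 1,560,000 cores in total, and under the 365-day assumption the model advances roughly a thousand simulated days for every day of wall-clock time.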
Haohuan Fu is deputy director of the National Supercomputing Center in Wuxi, where he leads the research and development efforts on Sunway TaihuLight, currently the fastest supercomputer in the world. He is also an associate professor in the Ministry of Education Key Laboratory for Earth System Modeling and the Department of Earth System Science at Tsinghua University, where he leads the High Performance Geo-computing (HPGC) research group. His research interests include design methodologies for highly efficient and highly scalable simulation applications that can take advantage of emerging multi-core, many-core, and reconfigurable architectures and make full use of current petaflops and future exaflops supercomputers, as well as intelligent data management, analysis, and data mining platforms that combine statistical methods and machine learning technologies. Fu has a PhD in computing from Imperial College London. Since joining Tsinghua in 2011, Dr. Fu has been working towards the goal of providing both the most efficient simulation platforms and the most intelligent data management and analysis platforms for geoscience applications. His research has, for example, led to efficient designs of atmospheric dynamic solvers for the Tianhe-1A, Tianhe-2, and Sunway TaihuLight supercomputers, as well as for reconfigurable computing platforms.
Thanks to Rich Brueckner from insideHPC Media Publications for recording the video.