At GTC 2022, Nvidia announced a new product family that aims to cover everything from small enterprise workloads to exascale HPC and trillion-parameter AI models. This column highlights the most interesting features of their new Hopper GPU and Grace CPU chips and the Hopper product family. We also discuss some of the history behind Nvidia technologies and the features most useful to computational scientists, such as the Hopper DPX dynamic programming instruction set, the increased number of SMs, and FP8 tensor core availability. Also included are descriptions of the new Hopper clustered-SM architecture and the updated NVSwitch technology that integrates their new ARM-based Grace CPU.
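To make the DPX feature concrete: the instructions accelerate the fused min/max-plus-add inner loops that dominate dynamic programming kernels such as Smith-Waterman alignment or Levenshtein edit distance. Below is a minimal, plain-Python sketch of such a recurrence (edit distance); it is only an illustration of the computational pattern DPX targets, not Nvidia code or the DPX API itself.

```python
# Classic dynamic-programming recurrence (Levenshtein edit distance).
# The three-way min over sums in the inner loop is the pattern that
# Hopper's DPX instructions fuse into single hardware operations.
def edit_distance(a: str, b: str) -> int:
    # prev[j] holds the distance between a[:i-1] and b[:j]; we keep
    # only one previous row to stay O(len(b)) in memory.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i] + [0] * len(b)
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[len(b)]

print(edit_distance("kitten", "sitting"))  # → 3
```

On a GPU the same recurrence is evaluated wavefront-by-wavefront across threads; DPX simply reduces the instruction count of each cell update.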
As cloud computing grows, the types of computational hardware available in the cloud are diversifying. Field Programmable Gate Arrays (FPGAs) are a relatively new addition to high-performance computing in the cloud, with the ability to accelerate a range of different applications and the flexibility to support different cloud computing models. A new and growing configuration attaches the FPGAs directly to the network, reducing the latency of delivering data to the processing elements. We survey the state of the art in FPGAs in the cloud and present the Open Cloud Testbed (OCT), a testbed for research and experimentation into new cloud platforms that includes network-attached FPGAs in the cloud.
It has been both a strange and unsettling year, with many uncertainties related to COVID-19. Both premier supercomputing conferences, ISC'20 and SC20, went virtual, preventing attendees from exploring the show floors with their examples of novel architectures and from seeing demos of the related tools. Companies also slowed their procurement of high-end systems during 2020, as was discussed during the presentation of top500.org statistics at the Top-500 Birds-of-a-Feather session at SC20. The Top-500 list measures the world's largest computers using the High-Performance Linpack benchmark (see www.top500.org). Systems must solve large dense systems of linear equations without using the operation-saving Strassen algorithm or a mix of lower precisions; the calculations must conform to LU factorization with partial pivoting, performing \(\frac{2}{3}n^3 + O(n^2)\) operations on double-precision (64-bit) floating-point numbers.

Despite less turnover among the world's largest computer systems, technologies continue to advance, and the trends are more interesting and diverse than ever.

The US dominated the computing industry, including the supercomputer industry, for years, until Japan gave the US a jolt with the introduction of the Earth Simulator back in 2002. This put supercomputing back in the spotlight, especially in the US but also in Europe. For the past decade, the US, Japan, and China have dominated the top of the Top-500 list. As can be seen from the following table, these systems were built primarily on US-designed chips from IBM and Intel, coupled with accelerators from NVIDIA. The AMD-based Titan Cray/HPE system and the Sunway system in China, as well as the new ARM-based Fujitsu Fugaku supercomputer, are notable exceptions. In this article, we discuss the history of ARM and its current impact on HPC, as well as some other European trends.
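The benchmark's core computation can be sketched in a few lines. The following is an illustrative solve in the spirit of HPL, not the actual benchmark code: SciPy's `lu_factor`/`lu_solve` perform LU factorization with partial pivoting in double precision, the factorization accounting for the dominant \(\frac{2}{3}n^3\) floating-point operations. The matrix size and random seed here are arbitrary choices for the example.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# HPL-style dense solve: factor A with LU + partial pivoting,
# then solve Ax = b by forward/back substitution, all in float64.
n = 500  # illustrative size; real HPL runs use n in the millions
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

lu, piv = lu_factor(A)      # LU factorization with partial pivoting
x = lu_solve((lu, piv), b)  # O(n^2) triangular solves

# The factorization dominates the cost: roughly (2/3) n^3 flops.
flops = (2.0 / 3.0) * n**3
residual = np.linalg.norm(A @ x - b) / (np.linalg.norm(A) * np.linalg.norm(x))
print(f"~{flops:.2e} flops, scaled residual {residual:.2e}")
```

HPL reports the achieved flop rate from this operation count and the wall-clock time, and accepts a run only if a scaled residual like the one above is small.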