U.S. Angles to Retake Supercomputer Lead

Published: 2017-11-30 00:00

Author: Ameya360

Source: R. Colin Johnson

　　The latest Top500 list of the world’s fastest supercomputers turns the spotlight on China, which overtook the United States in the total number of ranked systems and which scored the top two fastest installations on the list. Announcements from IBM, Intel, and Advanced Micro Devices, however, position the U.S. industry for a comeback. Rather than target systems that test well on the Top500’s distributed-memory version of the Linpack benchmarks (High Performance Linpack), the companies aim to render those measurements irrelevant on their way to beating China to exascale computing.

　　China captured not only first and second place in the ranking of the fastest installed systems, but also won the majority share of ranked installations and took the aggregate performance lead, according to the November 2017 Top500 list, which was the 50th one to be published since the ranking debuted in June 1993. According to the Top500 organization, “There is no system from the USA under the Top3. #1 and #2 are installed in China ... the USA decreased to a new record low of 143 [installed Top500-ranked systems] from 169 six months ago. The number of systems installed in China increased to a new record high of 202, compared to 160 on the last list. China now clearly shows a substantially larger number of installations than the USA. China now is also pulling ahead of the USA in the performance category, with China holding 35.4% of the overall installed performance, while the USA is second, with 29.6%.”
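　　The shifts quoted above can be tallied in a few lines. What follows is only an illustrative sketch that uses the system counts and performance shares from the Top500 statement; the variable names are chosen for this example and are not part of any Top500 tooling.

```python
# System counts quoted by the Top500 organization: (June 2017, November 2017).
counts = {"USA": (169, 143), "China": (160, 202)}

# Change in ranked systems for each country over six months.
changes = {country: new - old for country, (old, new) in counts.items()}

# China's lead over the USA in ranked systems on the November 2017 list.
lead = counts["China"][1] - counts["USA"][1]

# Shares of overall installed performance, in percent, as quoted.
performance_share = {"China": 35.4, "USA": 29.6}
share_gap = round(performance_share["China"] - performance_share["USA"], 1)

print(f"Change in ranked systems: {changes}")
print(f"China's lead in ranked systems: {lead}")
print(f"China's lead in performance share: {share_gap} percentage points")
```

　　On these figures, the United States lost 26 ranked systems while China gained 42, which leaves China 59 systems and 5.8 percentage points of installed performance ahead.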

　　“The high-performance computing landscape is evolving at a furious pace that some are describing as an important inflection point,” Dave Turek, IBM’s vice president for high-performance computing (HPC) and OpenPOWER, wrote in a recent blog. “Realizing that these demands could only be addressed by an open ecosystem, IBM partnered with industry leaders Google, Mellanox, Nvidia, and others to form the OpenPOWER Foundation, dedicated to stewarding the Power CPU architecture into the next generation.”

　　IBM’s silicon contribution will be its Power9 processor, housing up to 24 cores with up to 8 billion FinFET transistors cast in 14-nanometer CMOS, 120 megabytes of shared level-three cache, eight-way simultaneous multithreading, and 230 gigabytes/second of bandwidth to memory. Its architecture, to be showcased at Oak Ridge and Lawrence Livermore National Labs, will pack thousands of Nvidia Volta graphics processing units (GPUs) aimed at boosting overall performance beyond that of China’s home-brewed Sunway CPUs.

　　IBM is banking mostly on its supercomputer data-centric architecture, which spreads out the processing power by embedding the processors at the locations where the data resides. This approach, according to Turek, yields a speedup of 5 to 10 times for the hardest applications: analytics; modeling; visualization; simulation; and artificial intelligence (AI), especially deep learning.

　　To address the specific architectural needs of AI, IBM has redesigned the data flow of its new Power9 processor to dovetail with massive numbers of GPUs and Nervana coprocessors. By scaling TensorFlow and Caffe across 256 Nvidia Tesla GPUs, IBM has been able to reduce deep learning times from 16 days to seven hours. The company aims to balloon this strategy to as many as 100 times more GPUs spread across 50,000 nodes by 2021, thereby achieving exascale computing (a billion billion calculations per second) before China does.
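　　As a rough check on the figures above, the short sketch below works out the training-time reduction and spells out what “a billion billion” means numerically. It uses only the numbers quoted in this article; the names are illustrative, and the sketch makes no assumption about what hardware produced the 16-day baseline.

```python
# Figures quoted in the article.
BASELINE_DAYS = 16   # deep learning training time before scaling
SCALED_HOURS = 7     # training time across 256 Nvidia Tesla GPUs

baseline_hours = BASELINE_DAYS * 24          # 16 days expressed in hours
speedup = baseline_hours / SCALED_HOURS      # ratio of old time to new time

# Exascale: a billion billion (10^9 x 10^9) calculations per second.
EXASCALE_OPS_PER_SECOND = 10**9 * 10**9

print(f"Baseline: {baseline_hours} hours")
print(f"Speedup: about {speedup:.1f}x")
print(f"Exascale: {EXASCALE_OPS_PER_SECOND:.0e} calculations per second")
```

　　Going from 384 hours to seven works out to a reduction of roughly 55 times, and the exascale target corresponds to 10^18 calculations per second.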

　　“Power9 is loaded with industry-leading new technologies designed for AI to thrive,” IBM Fellow Brad McCredie, vice president of cognitive systems development, wrote in his blog. “With Power9, we’re moving to a new, off-chip era, with advanced accelerators like GPUs and FPGAs [field-programmable gate arrays] driving modern workloads, including AI.”

　　McCredie claims that the Power9 will form the basis of a commercial platform with “giant hose” bandwidth to its GPUs using OpenCAPI. The same OpenCAPI hose will also enable coherent FPGAs to render obsolete the Top500’s parallel version of the Linpack measurements, which center on systems that merely amass millions of CPUs. Instead, true cognitive benchmarks will enable deep learning metrics that show the superiority of the PowerAI platform on distributed deep learning (DDL) benchmarks, IBM says.

　　Intel likewise is rejecting the mere amassing of CPUs in its Xeon Phi evolution, according to Trish Damkroger, vice president of Intel’s Data Center Group and general manager of its Technical Computing Initiative. The company is ditching the planned Knights Hill version in favor of “a new platform and new microarchitecture specifically designed for exascale,” Damkroger wrote in her blog. “Combined with our comprehensive HPC solutions portfolio, spanning compute, storage, I/O, and software, the updated roadmap is well poised to energize the exascale revolution.”

　　Intel is concentrating on extending its current Scalable System Framework (SSF), combined with new add-on accelerators that outperform a mere matrix of CPUs. It is also diversifying its Select Solutions with on-chip FPGAs and fat-pipe interfaces to 3-D memory cubes. By offering targeted solutions optimized for Big Data, deep learning AI, and other next-generation workloads, the company hopes to render obsolete the Top500’s parallel-CPU version of Linpack, as well as make its CORAL (Collaboration of Oak Ridge, Argonne, and Livermore Labs) systems at the U.S. Argonne National Laboratory the first exascale supercomputers worldwide.

　　Likewise, AMD recently announced its reentry into the exascale supercomputer race, by virtue of its new Epyc processors, Infinity interconnection fabric, and Radeon Instinct GPUs. The Epyc optimizes floating-point unit performance, as opposed to the wider vector processors of the Intel Xeon. AMD has already announced supercomputer design wins at Hewlett Packard Enterprise; Supermicro; Penguin Computing; Tyan; ASUS; Gigabyte Technology; BOXX; EchoStreams; and Dell, which will add Epyc servers to its PowerEdge line.

　　AMD has also collaborated with Inventec to produce the Project 47 supercomputer, which has four times as many Radeon Instinct GPUs as Epyc processors and is due to be delivered in the first quarter. And, as tradition would have it, AMD is pricing its solutions below those of Intel and IBM.
