Nvidia Taps Memory, Switch for AI

Published: 2018-03-28 00:00
Author: Ameya360
Source: Rick Merritt

  At its annual GTC event, Nvidia announced system-level enhancements to boost the performance of its GPUs in training neural networks and a partnership with ARM to spread its technology into inference jobs.

  Nvidia offered no details of its roadmap, presumably for 7-nm graphics processors in 2019 or later. It has some breathing room, given that AMD is just getting started in this space, Intel is not expected to ship its Nervana accelerator until next year, and Graphcore — a leading startup — has gone quiet. A few months ago, both Intel and Graphcore were expected to release production silicon this year.

  The high-end Tesla V100 GPU from Nvidia is now available with 32 GBytes of memory, twice the HBM2 DRAM capacity that it supported when launched last May. In addition, the company announced NVSwitch, a 100-W chip made in a TSMC 12-nm FinFET process. It sports 18 NVLink 2.0 ports and can link 16 GPUs to shared memory.

  Nvidia became the first company to make the muscular training systems expected to draw 10 kW of power and deliver up to 2 petaflops of performance. Its DGX-2 will pack 12 NVSwitch chips and 16 GPUs in a 10U chassis that can support two Intel Xeon hosts, Infiniband or Ethernet networks, and up to 60 solid-state drives.
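  To put the DGX-2 figures in proportion, the quoted numbers can be divided out in a few lines of Python. This is only a back-of-the-envelope sketch: it uses the peak performance, power, GPU count, and switch port count exactly as stated in the article, the variable names are illustrative, and real per-GPU throughput and efficiency will depend on workload and precision.

```python
# Figures as quoted in the article (peak/expected values, not measurements).
PEAK_FLOPS = 2e15        # "up to 2 petaflops"
POWER_WATTS = 10e3       # "expected to draw 10 kW"
GPU_COUNT = 16           # GPUs per DGX-2
NVSWITCH_COUNT = 12      # NVSwitch chips per DGX-2
PORTS_PER_NVSWITCH = 18  # NVLink 2.0 ports per NVSwitch

# Peak throughput attributed to each GPU, in teraflops.
tflops_per_gpu = PEAK_FLOPS / GPU_COUNT / 1e12

# Peak efficiency of the whole system, in gigaflops per watt.
gflops_per_watt = PEAK_FLOPS / POWER_WATTS / 1e9

# Total NVLink ports provided by the switch chips in one chassis.
total_switch_ports = NVSWITCH_COUNT * PORTS_PER_NVSWITCH

print(f"{tflops_per_gpu:.0f} TFLOPS per GPU")
print(f"{gflops_per_watt:.0f} GFLOPS per watt")
print(f"{total_switch_ports} NVSwitch ports in total")
```

  The per-GPU figure is a simple even split of the system peak; it says nothing about how well a given model scales across all 16 GPUs.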

  Cray, Hewlett Packard Enterprise, IBM, Lenovo, Supermicro, and Tyan said that they will start shipping systems with the 32-GB chips by June. Oracle plans to use the chip in a cloud service later in the year.

  Claims of performance increases using the memory, interconnect, and software optimizations ranged widely. Nvidia said that it trained a FAIRSeq translation model in two days, an eight-fold improvement over a test in September that used eight GPUs with 16 GBytes of memory each. Separately, SAP said that it eked out a 10% gain in image recognition using a ResNet-152 model.

  Intel aims to leapfrog Nvidia next year with a production Nervana chip sporting 12 100-Gbit/s links compared to six 25-Gbit/s NVLinks on Nvidia’s Volta. The non-coherent memory of the Nervana chip will allow more flexibility in creating large clusters of accelerators, including torus networks, although it will be more difficult to program.
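  As a rough check on the link comparison above, the raw aggregate bandwidth per chip can be tallied in Python. This sketch simply multiplies link count by per-link rate using the figures as quoted in the article; the helper name is hypothetical, and the result ignores encoding overhead, link directionality, and network topology.

```python
def aggregate_gbits(link_count: int, gbits_per_link: int) -> int:
    """Raw aggregate bandwidth: number of links times the per-link rate."""
    return link_count * gbits_per_link

# Figures as quoted in the article, not independently verified.
nervana_total = aggregate_gbits(12, 100)  # planned production Nervana chip
volta_total = aggregate_gbits(6, 25)      # NVLinks on Nvidia's Volta

# How many times more raw link bandwidth the Nervana figure implies.
ratio = nervana_total / volta_total

print(f"Nervana: {nervana_total} Gbit/s")
print(f"Volta:   {volta_total} Gbit/s")
print(f"Ratio:   {ratio:.0f}x")
```

  On these numbers alone the planned Nervana part would carry several times Volta's raw link bandwidth, though the article notes that its non-coherent memory makes it harder to program.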

  To ease the coding job, Intel has released its nGraph compiler as open source. It aims to turn software from third-party AI frameworks like Google’s TensorFlow into code that can run on Intel’s Xeon, Nervana, and eventually FPGA chips.

  The code, running on a prototype accelerator, is being fine-tuned by Intel and a handful of data center partners. The company aims to announce details of its plans at a developer conference in late May, though production chips are not expected until next year. At that point, Nvidia will be under pressure to field a next-generation part to keep pace with an Intel roadmap that calls for annual accelerator upgrades.

  “The existing Nervana product will really be a software development vehicle. It was built on 28nm process before Intel bought the company and it's not competitive with Nvidia's 12nm Volta design,” said Kevin Krewell, a senior analyst with Tirias Research.

  Volta’s added memory and NVSwitch “keeps Nvidia ahead of the competition. We're all looking forward to the next process shrink, but, as far as production shipping silicon goes, Volta still has no peer,” he added.

  Among startups, Wave Computing is expected to ship its first training systems for data centers and developers this year. New players are still emerging.

  Startup SambaNova Systems debuted last week with $56 million from investors, including Google’s parent Alphabet. Co-founder Kunle Olukotun’s last startup, Afara Websystems, designed what became the Niagara server processor of Sun Microsystems, now Oracle.

  Nvidia currently dominates the training of neural network models in data centers, but it is a relative newcomer to the broader area of inference jobs at the edge of the network. To bolster its position, Nvidia and ARM agreed to collaborate on making Nvidia’s open-source hardware for inferencing available as part of ARM’s planned machine-learning products.

  Nvidia announced last year that it would open-source IP from its Xavier inference accelerator. It has made multiple RTL releases to date. The blocks compete with AI accelerators offered by Cadence, Ceva, and Synopsys, among others.

  Just which Nvidia blocks ARM will make available, and when, remains unclear. So far, ARM has only sketched out its plans for AI chips as part of a broad Project Trillium. An ARM representative would only say that ARM aims to port its emerging neural net software to the Nvidia IP.

  Deepu Talla, general manager of Nvidia’s group overseeing Xavier, said that he is aware of multiple chips being designed using the free, modular IP. However, so far, none have been announced.

  Nvidia hopes that the inference effort will spread use of its machine-learning software, which is also used in training AI models. To that end, the company announced several efforts to update its code and integrate it into third-party AI frameworks.

  TensorRT 4, the latest version of Nvidia’s runtime software, boosts support for inferencing jobs and is being integrated into version 1.7 of Google’s TensorFlow framework. Nvidia is also integrating the runtime with the Kaldi speech framework, Windows ML, and Matlab, among others.

  Separately, the company announced that the RTX software for ray tracing that it announced last week is now available on V100-based Quadro GV100 chips, sporting 32-GBytes memory and two NVLinks.

  The software enables faster, more realistic rendering for games, movies, and design models. It runs on Nvidia proprietary APIs as well as Microsoft’s DirectX for ray tracing and will support Vulkan in the future.

  The software delivers 10x to 100x improvements compared to the CPU-based rendering that dominates a market forecast to exceed $2 billion by 2020, said Bob Pette, vice president of Nvidia’s professional visualization group.
