ARM SoCs Take Soft Roads to Neural Nets

Release time: 2017-06-30
Source: EE Times

NXP is supporting inference jobs such as image recognition in software on its i.MX8 processor. It aims to extend its approach to natural-language processing later this year, claiming that dedicated hardware is not required in resource-constrained systems.

The chip vendor is following in the footsteps of its merger partner, Qualcomm. However, the mobile giant expects to eventually augment its code with dedicated hardware. Their shared IP partner, ARM, is developing neural networking libraries for its cores, although it declined an interview for this article.

NXP’s i.MX8 packs two GPU cores from Vivante, now part of Verisilicon. They use about 20 opcodes that support multiply-accumulates and bit extraction and replacement, originally geared for running computer vision.

“Adding more and more hardware is not the way forward on the power budget of a 5-W SoC,” said Geoff Lees, NXP’s executive vice president for i.MX. “I would like to double the Flops, but we got the image processing acceleration we wanted for facial and gesture recognition and better voice accuracy.”

The software is now in use with NXP’s lead customers for image-recognition jobs. Meanwhile, Verisilicon and NXP are working on additional extensions to the GPU shader pipeline targeting natural-language processing. They hope to have the code available by the end of the year.

“Our VX extensions were not originally viewed as a neural network accelerator, but we found [that] they work extraordinarily well … the math isn’t much different,” said Thomas “Rick” Tewell, vice president of system solutions at Verisilicon.

The GPU cores come with OpenCL drivers. “No one has to touch the instruction extensions … people don’t want to get locked into an architecture or tool set; they want to train a set of engineers who are interchangeable.”

One i.MX8 dev kit supports up to eight cameras. (Image: NXP)

ARM is taking a similar approach with its ARM Compute Library, released in March to run neural net tasks on its Cortex-A and Mali cores.

“It doesn’t have a lot of features yet and only supports single-precision math — we’d prefer 8-bit — but I know ARM is working on it,” said a Baidu researcher working on its neural net benchmark. “It also lacks support for recurrent neural nets, but most libraries still lack this.”

For its part, Qualcomm released its Snapdragon 820 Neural Processing Engine SDK earlier this year. It supports jobs run on the SoC’s CPU, GPU, and DSP and includes Hexagon DSP vector extensions to run 8-bit math for neural nets.
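
For readers unfamiliar with what "8-bit math" means for inference, the sketch below shows one common scheme: floating-point weights and activations are scaled into signed 8-bit integers, multiplied and accumulated as integers, then rescaled. It is illustrative only. The function names and the symmetric scaling scheme are assumptions made for this example, not code from the Qualcomm SDK, the ARM Compute Library, or NXP's software.

```python
# Illustrative sketch: symmetric 8-bit quantization of a dot product.
# All names here are hypothetical and do not come from any vendor SDK.

def scale_for(values):
    """Pick a scale so the largest magnitude maps to 127."""
    peak = max(abs(v) for v in values)
    return peak / 127 if peak else 1.0

def quantize(values, scale):
    """Map floats to signed 8-bit integers, clamping to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dot(weights, activations):
    """Dot product with 8-bit operands and a wide integer accumulator."""
    w_scale = scale_for(weights)
    a_scale = scale_for(activations)
    qw = quantize(weights, w_scale)
    qa = quantize(activations, a_scale)
    acc = 0  # hardware would typically keep this in a wider register
    for w, a in zip(qw, qa):
        acc += w * a  # one multiply-accumulate per element
    return acc * w_scale * a_scale  # rescale back to floating point

weights = [0.5, -1.0, 0.25, 0.75]
activations = [1.0, 0.5, -2.0, 0.25]
exact = sum(w * a for w, a in zip(weights, activations))
approx = int8_dot(weights, activations)
```

The quantized result only approximates the floating-point one. The trade is a small loss of accuracy for narrower operands, which is the reason the Baidu researcher quoted above would prefer 8-bit support.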

“Long-term, there could be a need for dedicated hardware,” said Gary Brotman, director of product management for commercial machine-learning products at Qualcomm. “We have work in the lab today but have not discussed a time-to-market.”

The code supports a variety of neural nets, including LSTMs often used for audio processing. Both NXP and Qualcomm execs said that it’s still early days for the availability of good data sets to train models for natural-language processing. “Audio is the next frontier,” said Brotman.
