Arm Launches Cortex-A320 for AIoT Devices
| NEWS |
The Artificial Intelligence of Things (AIoT) represents a new class of Internet of Things (IoT) devices capable of running some or all AI/Machine Learning (ML) processing locally, rather than offloading computation to other devices in the network or cloud. Optimized processor designs are enabling more complex AI workloads to run on miniature embedded systems, a practice that has come to be known as "Tiny Machine Learning (TinyML)."
At the heart of this ecosystem is Arm, whose line of Cortex-A and Cortex-M Central Processing Units (CPUs) and Ethos Neural Processing Units (NPUs) provides the reference architectures for chipset designers, whose chip designs are in turn integrated directly into low-power embedded devices.
In March 2025, Arm announced the release of its Armv9 edge AI platform, which pairs the Ethos-U85 NPU with the brand-new Cortex-A320 CPU. According to Arm, the A320 is "specifically optimized for IoT applications," delivering a significant performance uplift over its predecessor, the Cortex-A35.
For the Arm developer community, the Armv9 edge AI platform also extends Arm's Kleidi software, which provides performance improvements for AI inference workloads without requiring additional developer effort. Kleidi was initially designed to accelerate AI performance on the CPU for mobile applications, but is now available for IoT as well. To support the development of varied AI models, Kleidi provides optimized AI libraries that support a wide range of AI frameworks and models for Machine Vision (MV), audio processing, and inferencing on sensor data across different hardware platforms.
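The "no additional developer effort" claim is easiest to see at the framework level. The minimal sketch below uses the TensorFlow Lite C API, one common on-device inference framework; the model file name and tensor sizes are placeholders. The premise, under Arm's description of Kleidi, is that the optimized kernels are picked up inside the framework's kernel layer, so application code like this would not change.

/* Ordinary TensorFlow Lite C API inference loop. The point of
 * Kleidi-style integration is that this application code stays the
 * same: accelerated Arm kernels run inside the framework.
 * "model.tflite" and the tensor sizes are hypothetical placeholders. */
#include <stdio.h>
#include "tensorflow/lite/c/c_api.h"

int main(void) {
    TfLiteModel *model = TfLiteModelCreateFromFile("model.tflite");
    TfLiteInterpreterOptions *opts = TfLiteInterpreterOptionsCreate();
    TfLiteInterpreter *interp = TfLiteInterpreterCreate(model, opts);
    TfLiteInterpreterAllocateTensors(interp);

    static float input[224 * 224 * 3];  /* placeholder camera frame */
    TfLiteTensor *in = TfLiteInterpreterGetInputTensor(interp, 0);
    TfLiteTensorCopyFromBuffer(in, input, sizeof(input));

    TfLiteInterpreterInvoke(interp);    /* optimized kernels run here */

    static float output[1000];          /* placeholder class scores */
    const TfLiteTensor *out = TfLiteInterpreterGetOutputTensor(interp, 0);
    TfLiteTensorCopyToBuffer(out, output, sizeof(output));
    printf("first score: %f\n", output[0]);

    TfLiteInterpreterDelete(interp);
    TfLiteInterpreterOptionsDelete(opts);
    TfLiteModelDelete(model);
    return 0;
}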
Performance, Efficiency, and Security Enhancements for Low-Power Embedded Systems
| IMPACT |
Arm is uniquely positioned to shape the ecosystem for AIoT devices through architectural innovation, working in partnership with its hardware licensees. The launch of the A320 presents three notable opportunities for developers of IoT systems.
First, the A320 provides greater versatility of AI processing capability for IoT systems. Arm has made this possible through the integration of Armv9 architectural features into the A320, most notably the Scalable Vector Extension 2 (SVE2), which allows workloads that would otherwise require a dedicated AI accelerator engine to execute on the CPU itself. This enables the A320 to run computations for Digital Signal Processing (DSP) and AI workloads for MV and audio processing on the smallest Cortex-A devices.
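To make the SVE2 point concrete, the following sketch shows a vector-length-agnostic multiply-accumulate loop of the kind that underpins DSP filters and neural network layers, written with Arm's ACLE intrinsics for C. It is an illustrative example rather than Arm code: the same binary adapts to whatever vector width a given Armv9 core implements.

/* Vector-length-agnostic multiply-accumulate with SVE2 ACLE
 * intrinsics; build with, e.g., -march=armv9-a (SVE2 is part of the
 * Armv9-A baseline). Illustrative sketch only. */
#include <arm_sve.h>
#include <stdint.h>

/* out[i] += a[i] * b[i], a typical DSP/ML inner loop */
void madd_f32(float *out, const float *a, const float *b, int64_t n) {
    for (int64_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32(i, n);      /* active-lane predicate */
        svfloat32_t va = svld1_f32(pg, a + i);  /* predicated loads */
        svfloat32_t vb = svld1_f32(pg, b + i);
        svfloat32_t vo = svld1_f32(pg, out + i);
        vo = svmla_f32_m(pg, vo, va, vb);       /* vo += va * vb */
        svst1_f32(pg, out + i, vo);             /* predicated store */
    }
}

Because the predicate handles the loop tail and the element count is queried at run time with svcntw(), no per-device retuning is needed as vector widths change across implementations.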
Second, the A320 delivers improved AI processing power and performance. Arm claims that the A320 can run AI models of more than 1 billion parameters on the device itself, with a 30% boost in scalar performance and up to a 10X ML performance enhancement compared to the A35, as well as power efficiency gains of over 50% compared to previous models. Maintaining low power consumption while executing increasingly complex AI workloads is vital for developers of power-constrained AIoT devices. If Arm can demonstrate these performance and efficiency gains in AIoT applications, it will expand the addressable market for the Cortex-A family of processors, which is typically used in higher-power devices in automotive, smartphone, or robotics systems.
Third, the A320 enhances security. When IoT devices are hardware-optimized to run AI workloads locally, sensitive or confidential data do not have to be sent across the network to the cloud for processing, reducing the risk that those data could be compromised. The A320 also incorporates the advanced security features of Branch Target Identification (BTI), Pointer Authentication, and the Memory Tagging Extension (MTE), which Arm says are important for securing devices that handle sensitive data and for mitigating cyberthreats.
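As a rough illustration of how these features reach developers (a sketch, not Arm guidance): BTI and Pointer Authentication are typically enabled with a compiler flag rather than source changes, so existing code, such as the indirect call below, gains protection on recompilation.

/* Illustrative hardened build for an Armv9 target. With GCC or Clang
 * for AArch64:
 *
 *     cc -march=armv9-a -mbranch-protection=standard -O2 app.c -o app
 *
 * -mbranch-protection=standard emits BTI landing pads and PAC
 * return-address signing with no source changes. MTE is typically
 * enabled through the toolchain, allocator, and OS (tag checking on
 * memory accesses) rather than in application code. */
#include <stdio.h>

typedef void (*handler_fn)(void);

static void on_alert(void) { puts("alert handled"); }

int main(void) {
    handler_fn handler = on_alert;
    /* Indirect calls like this are what BTI constrains: the branch
     * target must begin with a BTI instruction, so a corrupted
     * function pointer cannot divert execution into arbitrary code. */
    handler();
    return 0;
}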
Incorporating the Armv9 Edge AI Platform into Design Roadmaps Hinges on TCO Minimization and OEMs' Willingness to Pay
| RECOMMENDATIONS |
For chipset designers, the decision on whether to integrate Arm's Cortex-A320 and Ethos-U85 for edge AI will hinge on two factors: the expected reductions in Total Cost of Ownership (TCO) from a more power-efficient architecture, and IoT Original Equipment Manufacturers' (OEMs) willingness to pay a premium for higher-performance local AI processing, which will, in turn, depend on Arm's pricing for the Cortex-A320 relative to less-optimized Cortex family alternatives, including the A35.
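The shape of that TCO comparison can be sketched in a few lines of code. Every input in the example below is a hypothetical placeholder (fleet size, power draw, energy price, device lifetime); only the 50% efficiency figure echoes Arm's claim, and the output is illustrative rather than a forecast.

/* Back-of-envelope TCO sensitivity sketch. All numbers are
 * hypothetical placeholders, not Arm data; the point is the shape of
 * the calculation a chipset designer or OEM would run. */
#include <stdio.h>

int main(void) {
    double fleet_size      = 100000.0;  /* devices (hypothetical) */
    double watts_old       = 2.0;       /* avg draw, prior-gen design */
    double efficiency_gain = 0.50;      /* Arm's claimed ~50% uplift */
    double watts_new       = watts_old * (1.0 - efficiency_gain);
    double hours_per_year  = 24.0 * 365.0;
    double usd_per_kwh     = 0.15;      /* hypothetical energy price */
    double years           = 5.0;       /* assumed device lifetime */

    double kwh_saved = fleet_size * (watts_old - watts_new)
                       * hours_per_year * years / 1000.0;
    double usd_saved = kwh_saved * usd_per_kwh;

    printf("Fleet energy saved: %.0f kWh over %.0f years\n",
           kwh_saved, years);
    printf("Energy cost saved:  $%.0f\n", usd_saved);
    /* Weigh usd_saved against any per-unit licensing or Bill of
     * Materials premium to judge whether the upgrade pays back on
     * power efficiency alone. */
    return 0;
}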
For Arm's chipset designer partners seeking share in the IoT edge AI market, the key will be establishing demand from their own OEM customers. Customers that are already developing AI-capable IoT devices will need to explore whether the uplift in ML processing performance and efficiency, the greater workload versatility, and the enhanced security features are sufficient to justify the cost of upgrading and future-proofing their next generation of AIoT devices.
Chipset designers will also have to consider the new innovation opportunities in IoT markets that have yet to invest in on-device AI. If, as Arm claims, the Armv9 edge AI platform unlocks new opportunities for running Large Language Models (LLMs), Small Language Models (SLMs), and Agentic AI applications on power-constrained IoT devices, OEM customer segments that had previously been cautious about AI at the edge may now see a more compelling Return on Investment (ROI) case. The scale of this opportunity, and OEMs' willingness to pay relative to the development costs of licensing and integrating the new chipsets, need to be factored into the investment decision.
In particular, Arm's chipset partners will need to gauge OEMs' perceived opportunities to add MV, audio processing, and sensor data processing capabilities to far edge devices in the asset management, energy management, security, location and tracking, healthcare, condition-based monitoring, and smart city markets. Many IoT OEMs are content, for now, to use Generative Artificial Intelligence (Gen AI) applications that relay data to the cloud for processing, preferring the flexibility of pay-as-you-go cloud subscription models. For some, the addition of far edge AI processing power does not yet make economic sense, especially where inferencing is not time-critical, where the device has fast and reliable primary connectivity with fallback, or where the device is so cost-constrained that adding even low-level inferencing capability would make it too expensive.