India introduces a 96-core Arm processor of its own design: the country's first homegrown HPC chip

This week, India’s Centre for Development of Advanced Computing (C-DAC) announced the country’s first homegrown high-performance computing (HPC) processor. The chip, called Aum, is based on Arm’s Neoverse V1 architecture (codenamed Zeus, Armv8.4) and scales up to 96 cores. It is expected to launch as early as 2024 and will be manufactured by TSMC on a 5nm process.

Image Credit: Pexels / National Supercomputing Mission

Aum was developed as part of the National Supercomputing Mission, which aims to reduce India’s exposure to potential export restrictions. To that end, the brief called for a processor architecture developed at the national level. C-DAC also plans for future designs to compete with Intel and AMD in high-performance computing and in personal computers.

Image source: C-DAC

The 96-core Aum chip is built from A48Z chiplets, each with 48 Arm Zeus cores (base frequency 3GHz, turbo 3.5GHz), complemented by 96MB of L2 cache and an additional 96MB of shared L3 cache. Aum can be equipped with up to 64GB of HBM3 memory running at 5.6 GHz (although the controller supports up to 6.4 GHz), for a bandwidth of 2.87 TB/s. The chip also supports up to 16 channels of DDR5 memory at up to 5200MHz, providing a throughput of 332.8GB/s.
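The two bandwidth figures can be checked with simple arithmetic. The sketch below is only an illustration: the article does not state the number of HBM3 stacks or the width of a DDR5 channel, so it assumes four HBM3 stacks with the standard 1024-bit interface and sixteen 32-bit DDR5 subchannels, and it reads the quoted "5.6 GHz" and "5200MHz" as per-pin transfer rates (5.6 and 5.2 GT/s). Those assumptions happen to reproduce the quoted numbers, but they are not confirmed by C-DAC.

```python
# Theoretical peak bandwidth = interfaces x bus width (bytes) x transfer rate (GT/s).

def peak_bandwidth_gbs(interfaces: int, bus_width_bits: int, rate_gtps: float) -> float:
    """Return theoretical peak bandwidth in GB/s."""
    return interfaces * (bus_width_bits / 8) * rate_gtps

# Assumption: four HBM3 stacks, each with a 1024-bit interface, at 5.6 GT/s per pin.
hbm3 = peak_bandwidth_gbs(interfaces=4, bus_width_bits=1024, rate_gtps=5.6)

# Assumption: the 16 DDR5 "channels" are 32-bit subchannels at 5.2 GT/s.
ddr5 = peak_bandwidth_gbs(interfaces=16, bus_width_bits=32, rate_gtps=5.2)

print(f"HBM3: {hbm3 / 1000:.2f} TB/s")  # HBM3: 2.87 TB/s
print(f"DDR5: {ddr5:.1f} GB/s")         # DDR5: 332.8 GB/s
```

Under the same assumptions, running the HBM3 at the controller's 6.4 GT/s ceiling would raise the first figure to roughly 3.28 TB/s.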

There are 128 PCIe 5.0 lanes, 64 of which can host additional accelerators (such as GPU- or FPGA-based compute accelerators). The remaining 64 will likely be used for the chip’s internal communications fabric, a NUMA-style mesh network that is fully coherent with the memory interconnects and based on the CCIX protocol. This network also handles communication between two Aum sockets and borrows some design cues from AMD’s Infinity Fabric.

Aum offers 4.6 teraflops of performance per socket and 3 TB/s of aggregate memory bandwidth. This gives a byte-per-flop ratio of about 0.7, well above the 0.38 achieved by Japan’s Fugaku, the world’s fastest Arm-based supercomputer, and also clearly ahead of the American Summit, which is based on IBM and NVIDIA hardware (<0.2 bytes/flop). However, the Indian chip's TDP will be 300 watts, which means lower power efficiency than Fugaku's Arm-based A64FX processors.
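The byte-per-flop ratio is simply memory bandwidth divided by peak compute. The article does not say exactly how the 0.7 figure was derived, so the sketch below tries two readings: the rounded 3 TB/s aggregate, and the sum of the HBM3 and DDR5 figures quoted earlier. The function and variable names are illustrative only.

```python
def bytes_per_flop(bandwidth_tbs: float, peak_tflops: float) -> float:
    """Memory bandwidth (TB/s) divided by peak compute (TFLOPS)."""
    return bandwidth_tbs / peak_tflops

PEAK_TFLOPS = 4.6   # per socket, as quoted in the article
HBM3_TBS = 2.87     # HBM3 bandwidth quoted in the article
DDR5_TBS = 0.3328   # DDR5 bandwidth quoted in the article (332.8 GB/s)

# Reading 1: the rounded aggregate bandwidth of 3 TB/s.
rounded = bytes_per_flop(3.0, PEAK_TFLOPS)

# Reading 2 (assumption): aggregate = HBM3 + DDR5 bandwidth.
summed = bytes_per_flop(HBM3_TBS + DDR5_TBS, PEAK_TFLOPS)

print(f"rounded aggregate: {rounded:.2f} bytes/flop")  # rounded aggregate: 0.65 bytes/flop
print(f"HBM3 + DDR5:       {summed:.2f} bytes/flop")   # HBM3 + DDR5:       0.70 bytes/flop
```

Both readings land well above the 0.38 quoted for Fugaku; the second one matches the article's 0.7 most closely.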

If everything goes according to plan, the Indian Aum processor will become a strong contender in the supercomputing space. Importantly, it will be a domestic design, albeit only in part. Clearly, a lot of work has gone into the memory subsystem as a whole. C-DAC’s next step could be to modify the processor core itself, which would pave the way for a more sovereign chip in India and give impetus to “chip nationalization” in other countries.


About the author

Dylan Harris

Dylan Harris is fascinated by tests and reviews of computer hardware.
