Sunday, January 24, 2016

Tuning Up ARM To Do The HPC Math

For ARM processors to take off in the HPC arena, a whole bunch of pieces have to come together to create a platform that can compete against more established architectures. While many have obsessed – and correctly so – over the availability of production-grade 64-bit chips and Linux operating systems, to a certain extent the availability of compilers and their companion math libraries is just as important in the rarified air of HPC.

ARM Holdings, the commercial entity behind the ARM RISC instruction set and licensable processor components, is very eager for ARM chips from its various partners to take off in HPC. In fact, HPC is one of the two target areas where ARM Holdings believes that its eponymous architecture has a chance to gain a foothold in the datacenter and build some momentum, with the other area being hyperscale datacenter operators and their service provider peers.

To help accelerate the adoption of ARM chips for HPC workloads, ARM Holdings has been working with Numerical Algorithms Group for the past two years to port the latter company’s Fortran compiler and related Numerical Library to the 64-bit ARMv8-A architecture. But now ARM Holdings is taking it even one step further and is licensing NAG’s Numerical Library and its related software test suite tools so it can distribute those to customers in both open source and commercially supported variants.

“Computational mathematics for high end servers and HPC are very important to ARM, and linear algebra routines are important for computational mathematics,” Darren Cepulis, datacenter architect and server business development manager at ARM Holdings, explains to The Next Platform. “We endeavor to create a set of core math libraries that people can build higher level routines off of. These libraries will be optimized not just for our 64-bit implementations, but also those of our partners. As you know, different architectures and different memory subsystems can impact the performance of different math routines, and so it is important to have a set of libraries that are tuned for the hardware that you are running on.”
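To make the point about memory subsystems concrete, here is a minimal sketch of a tiled (blocked) matrix multiply in pure Python. It is not code from NAG or ARM; the function names and the `block` parameter are hypothetical, chosen only for illustration. The idea is that the tile size changes the order in which memory is touched without changing the result, and that is the kind of knob a tuned BLAS implementation would be expected to set differently for each processor's cache hierarchy.

```python
# Illustrative sketch only: not NAG's or ARM's implementation.
# `matmul_naive`, `matmul_blocked`, and `block` are hypothetical names.

def matmul_naive(a, b):
    """Textbook triple loop: C = A * B for lists of lists."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = 0.0
            for p in range(k):
                total += a[i][p] * b[p][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block=4):
    """Same product, computed tile by tile.

    `block` is the tuning knob: on real hardware a tile is sized so the
    pieces of A, B, and C being worked on stay in cache together.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                # Multiply one tile of A by one tile of B and
                # accumulate the partial sums into the tile of C.
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(p0, min(p0 + block, k)):
                        a_ip, row_b = row_a[p], b[p]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a_ip * row_b[j]
    return c
```

Calling `matmul_blocked(a, b, block=2)` and `matmul_blocked(a, b, block=8)` yields the same matrix up to floating-point rounding; only the memory access pattern differs. Pure Python will not show the cache effect meaningfully, so treat this as a picture of the structure rather than a benchmark.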

To that end, ARM is taking the BLAS, FFT, and LAPACK linear algebra and matrix math routines developed by NAG, which Cepulis says are the most widely used math routines on X86 platforms in the HPC space today, and tuning them up for ARM. These three libraries are not the full Numerical Library from NAG, but they are the key part that will get ARM started on optimized HPC application execution. It is not clear when or if ARM Holdings will license the full Numerical Library set from NAG, but Cepulis is clear that ARM software engineers will be doing further tuning of these three key HPC routines to squeeze more performance out of the ARMv8-A architecture. This optimization work is not a one-off thing, mind you. The architectures of the chips are changing at a steady pace, and there are going to be more implementations of the ARMv8 architecture coming to market this year and next, so the testing and tuning of the math routines will get broader and deeper. Cepulis says that the optimization work will be ongoing for the next couple of years, given the number of implementations that are coming down the pike and the number of compilers with which the math libraries need to integrate.

The math libraries that ARM has licensed will currently work on anything that supports the 64-bit AArch64 architecture, but they have been tuned to work better with two targets: ARM’s own Cortex-A57 cores, along with any chip that makes use of them, and the ThunderX processors from Cavium Networks. ARM will tune the math libraries for its Cortex-A72 cores next, and presumably Applied Micro’s X-Gene processors, which are also being aimed at HPC workloads, will follow. Others like Broadcom and Qualcomm, which are working on their own beefy ARM server chips, will no doubt join the party, as could others such as Phytium, Marvell, and AMD.

 

 
