We’ve just finished expanding the compute cluster at our nearby Cambridge server facility with another ten Dell servers. This brings it up to forty compute nodes in total.
Add a pair of NVIDIA Titan V cards to each node and you get amazing compute performance. While NVIDIA has much newer cards, the Titan V’s Volta processor suits what we need to do better than anything else, including NVIDIA’s latest-and-greatest Tesla cards.
The Titan V started off as incredibly good value and is even better now that it’s a generation old, heading on two, and sells for far less. Unfortunately, the cards are getting hard to find, and with cloud GPU resources getting better and cheaper by the day, we think this is the last physical compute cluster we’re likely to build.
2 PFLOPS? That’s an awful lot of floating-point operations per second. Peta means 10 to the power of 15, or a million billion: a single petaFLOPS works out to roughly 130,000 calculations per second for every person on Earth, and we have two of them.
Well, NVIDIA quotes the Titan V at 110 TFLOPS per card for “mixed-precision matrix-multiply-and-accumulate”, which is conveniently something we need quite a lot of. Getting eighty Titan Vs to deliver the full 8.8 petaFLOPS has proven elusive, but just over 2 PFLOPS of equivalent burst performance is achievable for specific calculations. And even away from highly-optimised matrix sequences, the cluster offers over a petaFLOPS of generic FP32 compute.
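The headline arithmetic is easy enough to check. Here’s a quick back-of-the-envelope sketch; the per-card tensor peak is NVIDIA’s quoted figure, while the world-population figure and the 13.8 TFLOPS FP32 rating are assumptions on our part:

```python
# Back-of-the-envelope cluster arithmetic. The 110 TFLOPS tensor peak is
# NVIDIA's quoted figure; the ~7.7 billion world population and the 13.8
# TFLOPS FP32 rating are assumptions for illustration.
cards = 40 * 2                  # forty nodes, two Titan Vs each
tensor_tflops = 110.0           # per-card mixed-precision matrix peak
fp32_tflops = 13.8              # per-card generic FP32 peak (assumed rating)

tensor_peak = cards * tensor_tflops / 1000    # 8.8 PFLOPS theoretical peak
fp32_peak = cards * fp32_tflops / 1000        # ~1.1 PFLOPS generic FP32

per_person = 1e15 / 7.7e9       # one petaFLOPS shared by everyone on Earth

print(f"{tensor_peak:.1f} PFLOPS tensor peak, {fp32_peak:.2f} PFLOPS FP32")
print(f"{per_person:,.0f} calculations per second per person, per petaFLOPS")
```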
The overall specs are crazy: each card has 5,120 CUDA cores, so it’s a little supercomputer in its own right.
What do we do with all this compute power? Modelling. Anything from watching ripples pass through a driver cone to tracking pressure waves inside a cabinet. When you’re simulating, the finer the time-slices and the higher the spatial resolution, the more you understand about what’s going on. That means better designs and nicer sounds.
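To give a flavour of what that costs computationally, here’s a deliberately tiny sketch of the kind of explicit time-stepping such simulations are built on: a 1D wave equation on a uniform grid. It’s illustrative only; the real models are 3D, coupled, and far more elaborate, and every parameter below is invented for the example.

```python
import numpy as np

# Minimal sketch: explicit finite differences for the 1D wave equation
# u_tt = c^2 * u_xx. Illustrative only; real driver/cabinet models are 3D
# and coupled, and all the numbers here are invented for the example.
c = 343.0            # speed of sound in air, m/s
length = 1.0         # domain length, m
nx = 1000            # spatial resolution: more points, finer detail, more work
dx = length / (nx - 1)
dt = 0.5 * dx / c    # time-slice chosen to satisfy the CFL stability condition

u_prev = np.zeros(nx)    # displacement one time-slice ago
u = np.zeros(nx)         # displacement now
u[nx // 2] = 1e-3        # a small initial "ripple" in the middle of the domain

r2 = (c * dt / dx) ** 2
for _ in range(5000):    # each step advances the whole grid by one time-slice
    u_next = 2 * u - u_prev + r2 * (np.roll(u, -1) - 2 * u + np.roll(u, 1))
    u_next[0] = u_next[-1] = 0.0    # fixed boundaries, so waves reflect back
    u_prev, u = u, u_next
```

Halve the grid spacing and you double the points, and the stability condition halves the time-slice too, so even this 1D toy costs four times as much. In 3D the same refinement costs sixteen times as much, which is where the petaFLOPS go.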
OK, so maybe it’s just good fun as well.