Xilinx Alveo U55C Brings HBM FPGAs to the HPC Market


Xilinx Alveo U55C Cover

The Xilinx Alveo U55C marks a new push by the company to get into the HPC accelerator market, and with a fairly unique angle. Specifically, Xilinx has a device with networking, FPGA logic area, and HBM designed to accelerate some high-performance workloads. Let us get into this announcement.

Xilinx Alveo U55C Brings HBM FPGAs to the HPC Market

The Xilinx Alveo U55C is in some ways a means for Xilinx to uniquely enter a market currently dominated by NVIDIA. What Xilinx has here is basically a smaller version of NVIDIA’s vision for Grace. While that may seem far-fetched at first, the Alveo U55C has high-speed network fabric, its own control processor, the ability to accelerate workloads using programmable logic, and high-bandwidth memory all on a single card. NVIDIA’s Grace is still years out, but this vision is here today with the Xilinx Alveo U55C. (Note that NVIDIA has the BlueField-2 A100 as its offering today, but it is not as generally available.)

Xilinx Alveo U55C High Level Card

The basic idea here is that Xilinx allows one to create customized accelerator logic on the card, attached to 16GB of HBM2. If data comes in off of a network interface, it does not have to go through the host system. The acceleration can be pipelined directly on the card.

Xilinx Alveo U55C Key Features

Here are the key specs for the U55C. There are a few points worth noting. First, Xilinx is comparing this to the Alveo U280, but there are some major differences. The U55C doubles the HBM2 memory, but it loses the DDR4 memory. The other key difference is that the card is now a single slot solution instead of a dual slot solution. One other interesting item is that the typical power is up from 100W to 115W, but the maximum power is only 150W instead of 225W. That makes it much easier to integrate into systems, especially in power-constrained deployments.
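To make the U280 comparison concrete, here is a minimal Python sketch that works out the percentage change for each figure quoted above. The 8GB HBM2 value for the U280 is inferred from the statement that the U55C doubles it to 16GB. The dictionary layout and the `percent_change` and `compare` names are purely illustrative assumptions, not anything from Xilinx tooling.

```python
# Figures are the ones quoted in the text; the U280 HBM2 capacity (8GB) is
# inferred from "the U55C doubles the HBM2 memory" to 16GB.
# The data layout and function names are hypothetical, for illustration only.
SPECS = {
    "U280": {"hbm2_gb": 8, "typical_w": 100, "max_w": 225, "slots": 2},
    "U55C": {"hbm2_gb": 16, "typical_w": 115, "max_w": 150, "slots": 1},
}


def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) * 100.0 / old


def compare(old_card, new_card):
    """Return the percentage change for every spec shared by the two cards."""
    old, new = SPECS[old_card], SPECS[new_card]
    return {key: percent_change(old[key], new[key]) for key in old}


if __name__ == "__main__":
    for key, change in compare("U280", "U55C").items():
        print(f"{key}: {change:+.1f}%")
```

Run as a script, this prints one line per spec. The point it illustrates is that typical power rises modestly while maximum power, which is what a system has to be provisioned for, drops by a third.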

Xilinx Alveo U55C Specs Compared To U280

A bit of context here is also important. While one often thinks of HPC accelerators as the big 500W+ GPUs that sit in centralized supercomputers, there are a number of workloads that are more distributed.

Xilinx Alveo U55C For High Speed Memory Bound Applications

A great example of this is the CSIRO case study. The way to think about this one is that these cards are deployed across a large radio astronomy antenna array. The IT equipment is all solar powered, meaning that there are real power constraints, so these cards run below their 115W/150W ratings and can only use 90W. Here, having the cards means that data can be ingested and processed via a custom pipeline in the FPGA fabric leveraging HBM2. Cards like the NVIDIA T4 and NVIDIA A2 do not have HBM onboard, nor do they have networking. So a single slot card means fewer boxes and lower overall power consumption.

Xilinx Alveo U55C HPC At CSIRO

Beyond these scientific HPC pursuits, Xilinx is discussing acceleration results with LS-DYNA. This is an area where a large portion of the simulation can be done in custom logic on the FPGA fabric. Then the constraint becomes memory bandwidth, and that is where the HBM2 comes in. We will just quickly note that the comparison here is the Intel Xeon Platinum 8260L, a 2019-era processor with 24 cores and 35.75MB of cache. We now have the AMD Milan-X with 64 cores and 0.75GB of L3 cache. That L3 cache is designed to provide a significant speedup by avoiding the need to go to slower DDR4, much like HBM is used, although at different latency/capacity tiers.

Xilinx Alveo U55C LS DYNA Speedup

We asked about pricing for the Xilinx accelerated LS-DYNA but did not get an answer as to how it compares.

Beyond a single card, Xilinx is scaling out the architecture to multiple cards, so it has features like RoCE v2 and MPI capabilities. These are capabilities required for scale-out workloads.

Xilinx Alveo U55C Scale Out System Architecture

Vitis is Xilinx’s software platform that allows developers to work within familiar frameworks and not have to place logic on FPGAs. Xilinx has been putting a lot of effort into this to make its products easier to use.

Xilinx Vitis Platform November 2021

Here is an example of how this translates into HPC-style domains:

Xilinx Vitis Platform Example For U55C

Of course, Xilinx is a newer entrant into many of these areas, so having easily accessible software tools is important.

Final Words

It is refreshing to see something different. There is a lot of innovation in fitting this functionality into a 150W single slot power envelope. At STH we have been reviewing servers with 4x 400W or 8x 300W GPUs over the past few days, and those then add another several hundred watts for CPUs and NICs. Having an integrated solution, critically with HBM2, is certainly something different.

For those wondering, you can buy the Xilinx Alveo U55C on Xilinx.com. Xilinx is also working on getting these cards into various partner clouds.

Xilinx Alveo U55C How To Try

Still, this is one of the more unique and interesting solutions that we are seeing at SC21. There are a ton of domains where next-generation 500-600W accelerators are great, but there are others where they are simply not practical. The big question is really whether Vitis can unlock the power of the U55C over the next few quarters to help the cards gain adoption.
