AMD-Xilinx Debuts First Versal PCIe Accelerator Card

03.08.2022

AMD had just barely announced the completion of its acquisition of FPGA maker Xilinx when the entrance sign to the south San Jose Xilinx campus on Union Street (which was once a popular 9-hole golf course) flipped over to display the new owner’s corporate name and logo. Now, a week later, AMD-Xilinx has announced its first Data Center Accelerator Card based on a member of the Versal ACAP (adaptive compute acceleration platform) AI Core Series. (ACAP is the name AMD-Xilinx uses to designate its newest line of SoCs based on FPGA technology.)

The new card, dubbed the Xilinx VCK5000, looks like your typical FPGA-based PCIe accelerator card designed to boost the performance of key applications being run in servers and data centers. These key applications include AI and ML (machine learning) applications in addition to many other varied tasks such as genomics, drug discovery, data analytics, and video transcoding. Of course, Nvidia is the 800-pound gorilla in this space and the performance benchmarks that AMD-Xilinx is using are aimed straight at that competitor.

Over the past several months, AMD-Xilinx has boosted the performance of the same Xilinx VCK5000 hardware by 2.5x to 3x, as measured by two figures of merit, performance/watt and performance/dollar (aka TCO), on one specific ML workload: ResNet-50 v1.5. AMD-Xilinx has focused on these figures of merit because the absolute performance of the Xilinx VCK5000 is not at the top of the pack, but when its low power consumption and lower acquisition cost are also considered, the AMD-Xilinx acceleration card looks very good.

More specifically, AMD-Xilinx claims that the Xilinx VCK5000 can outperform GPUs in these two figures of merit because of its throughput efficiency, which the company attributes to an architectural advantage of an FPGA-based implementation: it squeezes out the “data bubbles” that arise in ML applications.

Fixed GPU architectures are designed to handle data chunks in fixed sizes, so when the size of these chunks varies, GPU architectures can have difficulty accommodating these changes. The resulting data bubbles lead to computational inefficiency because the GPU’s computational elements lack data to crunch much of the time. By contrast, the reprogrammable nature of the programmable logic in an AMD-Xilinx ACAP allows the device to be reconfigured so that the hardware more closely matches the data formats being used for the computation at hand.
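To make the data-bubble idea concrete, here is a minimal sketch of a deliberately simplified model. It assumes work is padded up to whole fixed-size chunks and that padded slots do no useful work. The function name `utilization` and the chunk size of 32 are illustrative assumptions, not figures from AMD-Xilinx or any GPU vendor, and real hardware scheduling is more involved than this.

```python
import math


def utilization(work_items: int, chunk_size: int) -> float:
    """Fraction of compute slots doing useful work when the workload
    is padded up to a whole number of fixed-size chunks.

    Simplified model (assumption): every chunk occupies chunk_size
    slots whether or not all of them hold real data.
    """
    if work_items <= 0 or chunk_size <= 0:
        raise ValueError("work_items and chunk_size must be positive")
    chunks = math.ceil(work_items / chunk_size)
    return work_items / (chunks * chunk_size)


if __name__ == "__main__":
    # Hypothetical chunk size of 32 slots; workload sizes are made up.
    for items in (32, 33, 48, 64):
        print(f"{items} items -> {utilization(items, 32):.3f} utilization")
```

In this toy model, a workload that exactly fills its chunks keeps every slot busy, while one item over the boundary forces a second, mostly empty chunk. Hardware that can be reshaped to the data, as the article describes for programmable logic, corresponds to choosing a chunk size that matches the workload.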

However, ResNet-50 performance, or any benchmark performance for that matter, is not the only story for FPGA-based ML implementations. In real-world ML applications, running the ML model to identify objects is not the be-all and end-all of the application. There are other practical tasks to be accomplished, as shown in the image below.

In the example above, the first step, which precedes image recognition using an ML model, is to decode and resize the incoming video stream to match the data input requirements of the ML model. After the model has detected, identified, and classified the object(s) in the video, there are additional tasks to perform including object cropping, image resizing, and object tracking. The programmable logic in an FPGA or ACAP can be configured to implement these additional tasks while a GPU is less suited for this sort of computational work due to its relative lack of algorithmic flexibility.
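A rough way to see why the surrounding stages matter is to add up a per-frame time budget. The sketch below assumes the stages run one after another on each frame, and every stage time in it is a hypothetical placeholder rather than a measurement of the VCK5000 or any GPU. The stage names follow the steps described above; the helper names are illustrative.

```python
from typing import Dict


def end_to_end_ms(stage_ms: Dict[str, float]) -> float:
    """Total per-frame time, assuming stages run sequentially."""
    return sum(stage_ms.values())


def speedup_after_accelerating(
    stage_ms: Dict[str, float], stage: str, factor: float
) -> float:
    """End-to-end speedup when only one stage gets `factor` times faster."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    accelerated = dict(stage_ms)
    accelerated[stage] = stage_ms[stage] / factor
    return end_to_end_ms(stage_ms) / end_to_end_ms(accelerated)


# Hypothetical per-frame stage times in milliseconds (not measured data).
PIPELINE_MS = {
    "decode_and_resize": 4.0,
    "ml_inference": 6.0,
    "crop_and_resize": 2.0,
    "object_tracking": 3.0,
}

if __name__ == "__main__":
    gain = speedup_after_accelerating(PIPELINE_MS, "ml_inference", 3.0)
    print(f"3x faster inference -> {gain:.2f}x faster end to end")
```

With these made-up numbers, tripling inference speed improves the whole pipeline by well under 2x, because decode, resize, crop, and tracking still take their share of each frame. That is the practical argument for hardware that can accelerate the pre- and post-processing stages as well as the model.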

The very best metric or benchmark for any processor, AI or otherwise, is the actual application or workload you will run on the device. That means that standardized benchmarks like ResNet-50 can give you a relative feel for performance among alternatives, but you only know for sure when you run your target application on the processor. Vendors also use TOPS as an easily derived proxy for performance, but the calculated Peak TOPS for a processor is probably not the same as the actual TOPS delivered while running real AI workloads. Different users prefer different metrics for comparing alternative accelerators. One user may prefer the fastest execution for specific workload(s), another may prefer the best performance/watt, and yet another may prefer the best performance/dollar. No AI accelerator excels at all three.
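The three preferences can be compared side by side with a few lines of arithmetic. In the sketch below, the card names, throughput figures, power draws, and prices are all invented for illustration and do not describe the VCK5000 or any real product. Performance/dollar here uses acquisition price only, which is a simplification of a full TCO calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    images_per_sec: float  # throughput on the target workload
    board_watts: float     # power drawn while running that workload
    price_usd: float       # acquisition cost only (simplified TCO)

    @property
    def perf_per_watt(self) -> float:
        return self.images_per_sec / self.board_watts

    @property
    def perf_per_dollar(self) -> float:
        return self.images_per_sec / self.price_usd


# Hypothetical cards with made-up numbers, chosen only to show that
# each metric can pick a different winner.
CARDS = [
    Accelerator("card_a", images_per_sec=10000, board_watts=400, price_usd=12000),
    Accelerator("card_b", images_per_sec=6000, board_watts=100, price_usd=6000),
    Accelerator("card_c", images_per_sec=3000, board_watts=75, price_usd=2000),
]

WINNERS = {
    "throughput": max(CARDS, key=lambda c: c.images_per_sec).name,
    "perf_per_watt": max(CARDS, key=lambda c: c.perf_per_watt).name,
    "perf_per_dollar": max(CARDS, key=lambda c: c.perf_per_dollar).name,
}

if __name__ == "__main__":
    for metric, name in WINNERS.items():
        print(f"best {metric}: {name}")
```

Substituting throughput measured on your own application for `images_per_sec` is what turns this from a benchmark comparison into the workload-specific evaluation the paragraph above recommends.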

 
