AMD's RV670 Does Double-precision At Half The Speed.

Markham (ON) - More and more, graphics card manufacturers tout the GFLOPs capability of their cards, hinting at the potentially enormous processing power hidden in those graphics processors. But those numbers, which recently hit 1 TFLOPs, aren't exactly comparable with the TFLOPs rankings on the Top500.org list, since the instructions differ and there are different ways to calculate these numbers. AMD's RV670 is an example of this dilemma. Graphics card manufacturers derive their peak numbers from the simplest instruction, one that can run through all 320 number-crunching processors of AMD's (ATI's) GPU or all 256 units of Nvidia's G80 and G90.
When AMD launched its most recent GPGPU part, the FireStream 9170, a few questions were floating around, especially the one about its ability to support the double-precision FP64 format. AMD's official product page lists the details for the FireStream 9170: AMD states that it can achieve a peak of around 500 GFLOPs in the single-precision FP32 format. However, given the general demand for double-precision FP64 support in academia and science, and AMD's claim that the new FireStream can support this format, the obvious question was how fast this card would be.
We spent some time with professors and developers over the past weeks and heard that they would be perfectly happy if the GPGPU chip could perform double-precision FP64 calculations 10x slower than FP32, just to be able to get results in double precision. The potential performance of a multi-GPGPU box would still be good value for many applications, at least those that do not need the large amounts of memory only a traditional supercomputer installation can offer.
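
How a peak figure of this kind comes about, and what the 10x tolerance mentioned above would mean in absolute terms, can be sketched with simple arithmetic. This is an illustration only: the function name, the two floating-point operations per unit per clock (one multiply-add), and the 0.8 GHz clock are assumptions that do not appear in this article. Only the 320 units and the figure of roughly 500 GFLOPs come from the text.

```python
def peak_gflops(units: int, flops_per_unit_per_clock: int, clock_ghz: float) -> float:
    """Theoretical peak: every unit retires its maximum FLOPs on every clock."""
    return units * flops_per_unit_per_clock * clock_ghz


# 320 units is from the article. Two FLOPs per clock (a multiply-add) and a
# 0.8 GHz clock are ASSUMED values, chosen only to illustrate the method.
fp32_peak = peak_gflops(units=320, flops_per_unit_per_clock=2, clock_ghz=0.8)

# The researchers quoted above would accept FP64 at one tenth of FP32 speed.
acceptable_fp64 = fp32_peak / 10

print(f"FP32 peak: {fp32_peak:.0f} GFLOPs")
print(f"FP64 at 10x slower: {acceptable_fp64:.1f} GFLOPs")
```

A peak computed this way assumes every unit is busy on every clock with the simplest instruction, which is why it is hard to compare with measured Top500 results.
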
AMD's Dave "Wavey" Baumann (of Beyond3D fame) told us that while AMD's RV670 chip supports double-precision calculations, it does not feature dedicated units for FP64; instead it uses the FP32 units to do FP64 calculations over a number of cycles. And yes, this process takes time. Depending on the complexity of the operation, the best-case scenario is around half the original single-precision FP32 performance, or about 250 GFLOPs; in the worst case, performance should be about a quarter of FP32 performance, or about 125 GFLOPs. Dave told us that the chip usually averages out somewhere in between, which is actually quite a feat for a chip that does not feature native FP64 units. At the end of the day, if you're running double-precision FP64 on AMD's FireStream 9170 board, you should expect to get between 100 and 250 GFLOPs (realistically, expect the former number). It will be interesting to see how AMD and Nvidia implement FP64 handling in the near future, but for now, the expected performance numbers should prove more than tasty enough to take the plunge and start developing accelerated applications on GPGPU hardware.
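
The best-case and worst-case figures quoted above follow directly from the FP32 peak. A minimal sketch, taking the article's figure of about 500 GFLOPs as input; the function name and the returned dictionary are illustrative and not part of any AMD tool, and the midpoint is plain arithmetic rather than a measured average.

```python
def fp64_estimate(fp32_peak_gflops: float) -> dict:
    """Estimate FP64 throughput when FP32 units emulate FP64 over several cycles."""
    best = fp32_peak_gflops / 2   # best case: half the FP32 rate
    worst = fp32_peak_gflops / 4  # worst case: a quarter of the FP32 rate
    # Illustrative midpoint only; the article says "somewhere in between".
    return {"best": best, "worst": worst, "midpoint": (best + worst) / 2}


estimate = fp64_estimate(500.0)  # ~500 GFLOPs FP32 peak, as stated by AMD
for case, gflops in estimate.items():
    print(f"{case}: {gflops:.1f} GFLOPs")
```
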

