Bert Beals of Cray Inc. recently told the Digital Energy Journal that the industry can no longer build an efficient supercomputer for seismic processing simply by adding more processors.
Indeed, because Dennard scaling no longer applies, advanced microprocessors now draw more power and require additional cooling to dissipate heat. Moreover, even though engineers can still fit more transistors on a microchip, clock rates are not expected to increase significantly, and some transistors may have to remain dark to stay within thermal limits.
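For context, the power constraint behind this follows from a standard first-order CMOS relation (a textbook sketch, not figures from the article):

```latex
% Dynamic switching power of a CMOS processor (first-order model):
P_{\mathrm{dyn}} \;\approx\; \alpha \, C \, V^{2} \, f

% Under classical Dennard scaling (feature size shrinking by s < 1 each
% generation): C -> sC, V -> sV, f -> f/s, device area -> s^2 * area,
% so power per device scaled as s^2 and power density stayed constant.
% With supply voltage V no longer scaling, packing transistors more densely
% or raising f increases power density, which is why clocks plateau and
% some transistors must stay "dark" (power-gated).
```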
According to the Digital Energy Journal, if systems designers want to increase processing power, they are likely to add more physical computers, spreading the generated heat over a larger volume, rather than packing microchips ever more densely. This places increased demands on interconnects and network architectures to support efficient communication between an ever-growing number of compute nodes.
As such, says Beals, supercomputers must support a more parallel architecture. Such a paradigm requires a different kind of interconnect, memory hierarchy and input-output strategy, rather than a serial optimization approach.
“You have to think about the overall systems architecture, combined with software architecture, combined with the people skills necessary to deal with processing requirements at massive scale,” he explained. “We have to carefully design our system architectures to keep all the cores ‘fed’. It is very different from buying 1,000 machines on the internet and cabling them together yourself with Ethernet switches. An integrated supercomputing environment with appropriate software and expertise is a much wiser investment than just trying to buy the lowest dollars per flop machine you can buy.”
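To illustrate why the interconnect becomes a first-order design concern at scale, here is a minimal, hedged sketch (not Cray's software; all sizes are arbitrary) of a halo exchange, the communication pattern that couples neighbouring compute nodes in many seismic wave-propagation codes:

```python
# Minimal halo-exchange sketch (illustrative only): each node owns a slice
# of the wavefield and must swap boundary values with its neighbours every
# time step before it can advance its own points. Assumes mpi4py and NumPy;
# run with e.g.  mpirun -n 4 python halo_sketch.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 1_000_000                  # grid points owned by this node
field = np.random.rand(n_local)      # local slice of the wavefield
left = (rank - 1) % size             # periodic neighbours, for simplicity
right = (rank + 1) % size

halo_from_left = np.empty(1)
halo_from_right = np.empty(1)

for step in range(10):               # time-stepping loop
    # Exchange one boundary value in each direction. As node counts grow,
    # this traffic (latency plus bandwidth) increasingly dominates over the
    # local arithmetic -- the reason an integrated interconnect beats
    # commodity Ethernet for this workload.
    comm.Sendrecv(field[-1:], dest=right, recvbuf=halo_from_left, source=left)
    comm.Sendrecv(field[:1], dest=left, recvbuf=halo_from_right, source=right)

    # Toy "stencil" update standing in for the real physics kernel.
    field[0] = 0.5 * (field[0] + halo_from_left[0])
    field[-1] = 0.5 * (field[-1] + halo_from_right[0])
```

On a handful of nodes the exchange is negligible; across thousands of nodes it is exactly the traffic a purpose-built interconnect is designed to keep off the critical path.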
As Beals notes, seismic processing algorithms already exist that will demand supercomputers perform dramatically faster than current capabilities allow.
“We have requirements from the oil and gas industry which show a need in the next 3-5 years for machines that are 10x what we’re running on today. In the next 10-15 years, we’re going to need machine capabilities that are 100x what we’re running today,” he concluded.
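As a rough, back-of-the-envelope reading of those targets (our arithmetic, not Beals's), the implied compound annual growth in machine capability can be computed directly:

```python
# Back-of-the-envelope arithmetic (not from the article): what annual
# growth in capability do the 10x and 100x targets imply?
def implied_annual_growth(factor: float, years: float) -> float:
    """Compound annual growth rate needed to reach `factor` in `years`."""
    return factor ** (1.0 / years)

for factor, years in [(10, 3), (10, 5), (100, 10), (100, 15)]:
    growth = implied_annual_growth(factor, years)
    print(f"{factor:>4}x in {years:>2} years -> ~{growth:.2f}x per year "
          f"({(growth - 1) * 100:.0f}% annual growth)")

# Output (approximate):
#    10x in  3 years -> ~2.15x per year (115% annual growth)
#    10x in  5 years -> ~1.58x per year (58% annual growth)
#   100x in 10 years -> ~1.58x per year (58% annual growth)
#   100x in 15 years -> ~1.36x per year (36% annual growth)
```

Either way, the targets far outpace what slowing clock and IPC scaling can deliver on their own, which is the point of the architectural rethink discussed below.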
Commenting on the above, Steven Woo, VP of Systems and Solutions at Rambus, told Rambus Press that the industry has placed an “increasing emphasis” on rethinking system architectures with a range of newer technologies that can help improve computation and fuel future improvements in data centers and High Performance Computing (HPC) systems.
“As Beals points out, while there are more transistors per chip, clock speeds are plateauing due to power and thermal limitations. Improvements in Instructions Per Clock cycle have plateaued as well,” Woo explained. “With the traditional paths for improving system performance no longer yielding gains at their historic rates, the industry must focus on rethinking system architectures to drive large improvements in performance and power efficiency.”
Further complicating matters, says Woo, is the fact that the traditional performance and power efficiency bottlenecks in systems have shifted over the years as both architectures and applications have evolved. To be sure, the relentless progression of Moore’s Law (which is now slowing) and the clock speed scaling prevalent throughout the 1990s and early 2000s improved computation capabilities so effectively that processing bottlenecks have moved to other areas.
“For example, the rise of big data analytics, in-memory computing, and machine learning has resulted in ever-larger amounts of data being generated and analyzed,” he continued. “In many systems today, so much data is transferred across networks that data movement is itself becoming a critical performance bottleneck. Moreover, the very act of moving data is consuming a significant amount of power, so much so that it’s often more efficient to move the computation to the data instead.”
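As a loose illustration of “moving the computation to the data” (a conceptual sketch with made-up sizes, not a description of any particular product), consider reducing a large dataset where it lives instead of shipping it across the network:

```python
# Conceptual sketch of "move the computation to the data" (made-up sizes,
# not measurements): compare bytes crossing the network when raw records
# are shipped to the compute node versus when a reduction runs next to
# the storage and only the result travels.
import numpy as np

RECORDS = 10_000_000                      # float64 samples held storage-side

def ship_data_to_compute(data: np.ndarray) -> float:
    """Naive approach: move all raw data over the network, then reduce."""
    bytes_moved = data.nbytes             # the entire dataset crosses the wire
    result = float(data.mean())
    print(f"ship data:    {bytes_moved / 1e6:,.1f} MB moved")
    return result

def ship_compute_to_data(data: np.ndarray) -> float:
    """Near-data approach: reduce where the data lives, move only the answer."""
    result = float(data.mean())           # computed storage-side
    bytes_moved = 8                       # a single float64 crosses the wire
    print(f"ship compute: {bytes_moved / 1e6:.6f} MB moved")
    return result

data = np.random.rand(RECORDS)            # stands in for storage-resident data
ship_data_to_compute(data)                # ~80 MB of network traffic
ship_compute_to_data(data)                # ~8 bytes of network traffic
```

The energy cost follows the same shape: moving bytes off-node costs far more than computing on them locally, which is what makes near-data approaches attractive.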
This is precisely why, says Woo, there is currently an industry-wide effort to re-examine the architecture of conventional computing platforms and to reduce, or even eliminate, some of these modern bottlenecks.
“There are a number of recent developments in the industry that address modern HPC and data center bottlenecks like Near Data Processing, the use of accelerators and the adoption of FPGAs. These industry efforts are focusing on both the hardware and the software infrastructure that ultimately will allow applications to achieve large gains in performance and power efficiency,” he added. “[For example], the CCIX consortium is slated to focus on the development of a Cache Coherent Interconnect for Accelerators, [while] the Coherent Accelerator Processor Interface (CAPI) will help enable further system improvements by allowing programmers to choose the most appropriate processors and accelerators to coherently share data. [These] are just two of the many examples of industry efforts to address these bottlenecks.”
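To give a rough feel for what coherent data sharing changes for programmers (a toy sketch with made-up classes, not the actual CCIX or CAPI programming interfaces), compare a copy-based offload with a coherent, zero-copy one:

```python
# Purely conceptual sketch (toy classes, not real accelerator APIs) of the
# programming-model difference coherent attachment makes: without coherence
# the host stages explicit copies; with coherent shared memory the
# accelerator operates on the host's data in place.
import numpy as np

class CopyBasedAccelerator:
    """Non-coherent device: data must be copied to and from device memory."""
    def run(self, host_data: np.ndarray) -> np.ndarray:
        device_copy = host_data.copy()        # explicit host -> device copy
        device_result = device_copy * 2.0     # stand-in for the offloaded kernel
        return device_result.copy()           # explicit device -> host copy

class CoherentAccelerator:
    """Coherent device: sees the same memory as the CPU, no staging copies."""
    def run(self, shared_data: np.ndarray) -> None:
        shared_data *= 2.0                    # operates on shared data in place

data = np.arange(4, dtype=np.float64)
result = CopyBasedAccelerator().run(data)     # two extra copies of the data
CoherentAccelerator().run(data)               # zero-copy; result lands in place
```

The zero-copy model is what lets programmers pick the most appropriate processor or accelerator for each task without paying a data-movement penalty every time they switch engines.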