In part one of this two-part blog series, Semiconductor Engineering editor in chief Ed Sperling spoke with Steven Woo, Rambus fellow and distinguished inventor, about the relationship between Moore’s Law and the thermal challenges faced by the semiconductor industry. Specifically, Woo highlighted how the breakdown of Dennard scaling around 2005 prompted GPU designers to place a major emphasis on thermal management and dissipation. In this blog post, we’ll take a closer look at thermal challenges and solutions across multiple verticals and applications including data centers, high performance computing (HPC), mobile devices and various memory types.
As Woo notes, power efficiency has become a top priority across the semiconductor industry in recent years.
“The amount of work that’s required [for thermal management] and the amount of space that’s required to actually deal with the heat is really growing over time – and is something that’s really unsustainable in our industry,” Woo states.
Commenting on battery-powered devices such as smartphones and tablets, Woo says that the mobile industry has developed multiple power-saving techniques like switching off specific system components that are not in use. These components can be quickly restarted when a change of operation is required.
“The ability to keep some silicon dark for certain amounts of time and to quickly turn it on while you turn off other parts of the silicon has been critical to managing the power and thermals for battery operated devices,” he adds.
Commenting on microfluidics, Woo observes that there is a lot of emphasis on using fluids to remove heat from certain systems.
“If you look at solutions like the latest version of the TPU from Google, there’s liquid cooling. If you look at supercomputing you see more and more machines at the high end that are using liquids to move the heat in a way that it can be dissipated more readily,” he explains. “In fact, there are inert liquids that you can even immerse whole boards into. Because they are inert, nothing shorts out. Having that direct liquid contact with something that is inert is by far the best way to remove the heat.”
Woo also emphasizes that DRAM is very sensitive to heat.
“If you are doing things correctly, then you [should be] designing [a system] so that the data movement doesn’t get affected [by the heat]. As a system designer, what you have to think about is how much power is going to be dissipated at the target data rate – and then think about everything you have to do to make sure the temperature remains relatively stable.”
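Woo's point can be made concrete with a back-of-the-envelope calculation. The sketch below is not from the interview. It assumes a simple steady-state model in which interface power is data rate times energy per bit, and temperature rises above ambient in proportion to power and a single lumped thermal resistance. All function names and every number in the usage example are hypothetical and chosen only for illustration.

```python
def memory_interface_power_w(data_rate_gbps_per_pin, pins, energy_pj_per_bit):
    """Estimate power (watts) spent moving data at a target data rate.

    Assumes a constant energy cost per bit moved, which is a simplification.
    """
    bits_per_second = data_rate_gbps_per_pin * 1e9 * pins
    return bits_per_second * energy_pj_per_bit * 1e-12


def steady_state_temp_c(ambient_c, power_w, theta_c_per_w):
    """Steady-state temperature for a single lumped thermal resistance.

    theta_c_per_w is the assumed thermal resistance, in degrees C per watt,
    from the device to the surrounding air.
    """
    return ambient_c + power_w * theta_c_per_w


def max_power_w(ambient_c, temp_limit_c, theta_c_per_w):
    """Largest power that keeps the device at or below temp_limit_c."""
    return (temp_limit_c - ambient_c) / theta_c_per_w


if __name__ == "__main__":
    # Illustrative, made-up inputs: 1024 pins at 2 Gb/s each, 4 pJ per bit,
    # 45 C air inside the chassis, 2 C/W cooling solution, 85 C limit.
    power = memory_interface_power_w(2.0, 1024, 4.0)
    temp = steady_state_temp_c(45.0, power, 2.0)
    budget = max_power_w(45.0, 85.0, 2.0)
    print(f"interface power: {power:.2f} W")
    print(f"steady-state temperature: {temp:.1f} C")
    print(f"power budget at the limit: {budget:.1f} W")
```

In this simplified model, raising the data rate raises power linearly, so either the energy per bit or the thermal resistance has to come down to hold the temperature steady. Real designs also have to account for transient behavior and heat from neighboring components, which this sketch ignores.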
As Woo points out, thermal challenges have prompted chip and memory architects to change the way systems are designed.
“For example, HBM memory uses stacking to try and reduce the distances that data has to move and reduce the amount of heat and power that needs to be dissipated. [The] graphics card [shown below] illustrates that even with those changes there’s still a lot of challenges with managing the thermals,” he elaborates.
“This is a modern graphics card that uses HBM memory, and you can see once again it’s quite thick and quite big. As I remove the fan cover here, [which] blows [air] over the radiator structure, we again see a really large, really complicated kind of structure whose job it is to whisk the heat away from the processor.”
In this case, says Woo, the board itself (where the processing is done) is much smaller than what you see on a GDDR6 card.
“However, the thermal solution is just as beefy. So, you have the processor and two HBM RAMs on a very compact board. Again, most of the volume is really dedicated towards dealing with the thermals,” he continues.
“You can also see a copper plate whose job it is to whisk the heat away from the processor and the memory. As well, there is a heat pipe structure which moves that heat across a large set of bladed fins and air flows over those fins to dissipate the heat.”
According to Woo, engineers have formulated extensive design guidelines that detail how air should flow and where the heat should be moving to keep an entire system at a reasonable temperature.
“Over the years people have gone from relatively simple types of fan blade structures (like you can see in this card) to something that’s much more advanced,” he adds.
“There is also a big emphasis on efficiency of those fans, as well as having relatively large surface area of the blades, but also being quiet. The design and the angles and the way that the blades are shaped heavily influence the ability to both move a lot of air and remain quiet.”