Rambus VP of Systems and Solutions Steven Woo recently penned an article for ChipEstimate about the changing data center. According to Woo, the evolution of computing from the PC-centric world of the 1980s and 1990s to today’s mobile+cloud environment has been a primary driver of change in processors, memory, storage and networks.
Clock speeds and the breakdown of Dennard Scaling
“As the nature of processing has changed, so too have each of these major subsystems. Each has improved at different rates. As a result, bottlenecks within computing systems have shifted and new bottlenecks have formed,” he explained. “Take for example the evolution of CPUs. In 1993, high-end processors were achieving clock speeds of up to 66MHz. Over the next 7 years, processor clock speeds improved by an impressive 20x and were hitting 1.5GHz.”
As Woo notes, clock speeds continued to increase in subsequent years, although the breakdown of Dennard Scaling meant another 20x improvement in clock speed wasn’t possible due to power limits and thermal constraints. Fortunately, a thriving Moore’s Law continued to provide copious amounts of transistors, which in turn allowed dramatic CPU gains to continue through integration and architectural advances such as multiple cores and multi-threading.
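The power wall behind the breakdown of Dennard Scaling can be seen with the classic first-order dynamic power relation P ≈ α·C·V²·f. The sketch below is a hypothetical back-of-the-envelope calculation (the activity factor, capacitance, and voltage figures are made-up placeholders, not values from Woo's article); it simply shows that once supply voltage stops scaling, a 20x clock increase implies roughly 20x the dynamic power.

```python
# Back-of-the-envelope illustration of why clock scaling stalled once
# Dennard Scaling broke down. Dynamic CPU power is roughly
#   P ~ alpha * C * V^2 * f
# (activity factor, switched capacitance, supply voltage, frequency).
# All numbers below are hypothetical and chosen only for illustration.

def dynamic_power(alpha, capacitance, voltage, frequency_hz):
    """Classic first-order dynamic power model: P = alpha * C * V^2 * f."""
    return alpha * capacitance * voltage ** 2 * frequency_hz

# A hypothetical ~1.5 GHz-era core dissipating a few tens of watts.
base = dynamic_power(alpha=0.2, capacitance=1.0e-7, voltage=1.3, frequency_hz=1.5e9)

# Another 20x clock increase with no further voltage scaling (the post-Dennard
# situation) scales power by roughly the same 20x -- far beyond what air
# cooling and power delivery can handle for a single core.
scaled = dynamic_power(alpha=0.2, capacitance=1.0e-7, voltage=1.3, frequency_hz=30e9)

print(f"baseline: {base:.0f} W, 20x clock: {scaled:.0f} W ({scaled / base:.0f}x power)")
```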
“The rapid evolution of the CPU and the integration of new functionality prompted memory system architects to improve memory bandwidth to keep the growing number of compute pipelines fed with data,” he elaborated. “Memory bus data rates increased to meet rising bandwidth demands, but growing challenges with maintaining signal integrity began to impact the number of DIMM slots per channel – and ultimately – memory capacities available to the processor.”
Multi-drop bus architecture signal integrity challenges
According to Woo, a primary cause of memory system signal integrity challenges is the multi-drop bus architecture that has been used in memory systems for many years. This architecture enables multiple DIMM slots to be provided on the memory channel, with users choosing how many DIMM slots to populate to meet their workload needs. However, as memory bus speeds continued to rise in response to the continuing need for more memory bandwidth, the capacitive loading associated with multiple DIMM modules became a bottleneck.
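To make the capacitive-loading issue concrete, here is a deliberately simplified, hypothetical sketch: treating the driver and the loaded bus as a single RC network, each added DIMM increases load capacitance, slows the signal edges, and pushes down the achievable data rate. The driver impedance, per-DIMM capacitance, and the "rise time under one-third of a bit period" rule of thumb are all illustrative assumptions, not figures from Woo's article.

```python
# First-order sketch of why capacitive loading on a multi-drop memory bus
# limits data rates. Each populated DIMM adds load capacitance to the shared
# signal trace; slower rise times leave less of each bit period as a valid
# sampling window. All values are hypothetical, for illustration only.

def rise_time_s(driver_resistance_ohm, load_capacitance_f):
    """10%-90% rise time of a simple single-pole RC network: ~2.2 * R * C."""
    return 2.2 * driver_resistance_ohm * load_capacitance_f

R_DRIVER = 34.0        # ohms, hypothetical driver impedance
C_PER_DIMM = 1.5e-12   # farads of added load per populated DIMM (illustrative)
C_BASE = 1.0e-12       # farads of trace + pin capacitance (illustrative)

for dimms in (1, 2, 3):
    tr = rise_time_s(R_DRIVER, C_BASE + dimms * C_PER_DIMM)
    # Rough rule of thumb: keep the rise time under ~1/3 of the bit period.
    max_rate_mtps = 1.0 / (3.0 * tr) / 1e6
    print(f"{dimms} DIMM(s): rise time {tr*1e12:.0f} ps, "
          f"rough data-rate ceiling ~{max_rate_mtps:.0f} MT/s")
```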
“[Consequently], increasing memory bus speeds required fewer DIMMs per memory channel to maintain signal integrity on multi-drop buses,” he explained. “To compensate for the loss of memory capacity and to further increase memory bandwidth, processors began to incorporate multiple memory channels. By doing so, processors continued to increase memory bandwidth while maintaining the number of DIMM slots available to the processor by distributing them among multiple memory channels.”
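The arithmetic behind multiple memory channels is straightforward: peak bandwidth per channel is the data rate multiplied by the bus width, and each added channel multiplies both the bandwidth and the DIMM slots available to the processor. The sketch below uses generic DDR4-3200-style numbers purely for illustration.

```python
# Simple arithmetic behind multi-channel memory: per-channel peak bandwidth is
# (data rate in MT/s) x (bus width in bytes), and additional channels multiply
# both the bandwidth and the DIMM slots available to a processor.
# Figures below are generic DDR4-style numbers used only for illustration.

def channel_bandwidth_gbs(data_rate_mtps, bus_width_bits=64):
    return data_rate_mtps * (bus_width_bits // 8) / 1000.0  # GB/s

DATA_RATE = 3200       # MT/s, e.g. a DDR4-3200 style channel
DIMMS_PER_CHANNEL = 2

for channels in (1, 2, 4, 8):
    bw = channels * channel_bandwidth_gbs(DATA_RATE)
    slots = channels * DIMMS_PER_CHANNEL
    print(f"{channels} channel(s): ~{bw:.1f} GB/s peak, {slots} DIMM slots")
```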
The move to multi-core and multi-threaded processors, says Woo, helped drive the need for higher memory bandwidth and memory capacity to increase processor utilization. More specifically, workloads that take advantage of multiple cores and multi-threading require a certain amount of memory bandwidth and memory capacity for each core and thread. Limiting the amount of memory bandwidth or capacity runs the risk of reducing processor utilization. In extreme cases, some cores or threads may need to sit idle due to lack of memory resources.
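A rough sizing sketch makes the utilization point concrete: if each core or thread needs a certain sustained bandwidth and working-set capacity, the memory system caps how many cores can be kept busy. The per-core requirements and system figures below are hypothetical placeholders, not measurements.

```python
# Rough sizing sketch: if each core/thread needs a certain memory bandwidth
# and capacity to stay busy, the memory system caps how many cores can be
# kept fed. All requirements below are hypothetical placeholders.

CORES = 32
BW_PER_CORE_GBS = 4.0       # hypothetical sustained bandwidth need per core
CAP_PER_CORE_GB = 8.0       # hypothetical working-set need per core

SYSTEM_BW_GBS = 100.0       # e.g. a few memory channels' worth of bandwidth
SYSTEM_CAP_GB = 192.0       # total installed DIMM capacity

fed_by_bw = int(SYSTEM_BW_GBS // BW_PER_CORE_GBS)
fed_by_cap = int(SYSTEM_CAP_GB // CAP_PER_CORE_GB)
busy_cores = min(CORES, fed_by_bw, fed_by_cap)

print(f"cores kept busy: {busy_cores} of {CORES} "
      f"(bandwidth allows {fed_by_bw}, capacity allows {fed_by_cap})")
print(f"estimated utilization: {100.0 * busy_cores / CORES:.0f}%")
```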
Server virtualization and hypervisors
As Woo points out, server virtualization took advantage of these architectural advances and achieved rapid adoption in the early 2000s. Prior to the adoption of virtualization, enterprises typically had a mix of servers from different manufacturers, each running their own operating systems. These heterogeneous infrastructures often had different numbers of software licenses for each type of machine. Achieving optimal hardware utilization was extremely difficult, as it meant predicting not only which software packages and tools would be used, but also planning for which machines they would be used on. The introduction of hypervisors allowed software to run more easily across heterogeneous infrastructures, improving data center hardware utilization and lowering operating costs.
Buffer chips in the data center
“The challenge of maintaining signal integrity as data rates increased was mitigated – in part – by adding additional silicon to the memory modules. Buffer chips are typically used in server memory systems to improve signal integrity and timing relationships for commands and addresses sent to the memory modules,” he stated. “In some systems, buffers are also used for information sent on the data wires, especially when memory buses are required to support many DIMM modules at the highest data rates. Fully Buffered DIMMs (FB-DIMMs), a memory module and memory bus variant that came to market for a short time, coupled buffer silicon on DIMMs with a change to the memory bus architecture that daisy-chained DIMM modules together.”
Currently, servers employ both Registered DIMMs (RDIMMs) and Load-Reduced DIMMs (LRDIMMs) in multi-drop bus architectures. Both have additional buffer silicon to improve signal integrity for commands and addresses, with LRDIMMs including additional buffer silicon that improves signal integrity on the data bus.
The benefits of buffer chips
“The benefits of buffer chips on memory modules extend all the way to the system level. The additional silicon helps increase data rates, while allowing more modules to be connected to a memory bus. This enables processors to run larger workloads and process them in a more timely fashion,” he continued. “Without some way to support many DIMMs per CPU it would be difficult for processors to access large amounts of data, and processors would run the risk of being heavily underutilized. In this case, obtaining more memory capacity would also mean buying more CPUs to keep important data sets in memory, potentially worsening utilization across all processors in the system and ultimately leading to poor TCO.”
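The TCO argument can be illustrated with simple arithmetic: if an in-memory data set must stay resident, the per-socket memory capacity determines how many CPUs must be purchased just for their DIMM slots. The data set size, slot count, and DIMM capacities in the sketch below are hypothetical.

```python
# Illustration of the TCO point above: if per-socket memory capacity is low,
# keeping a large data set resident in memory forces the purchase of extra
# CPUs purely for their DIMM slots. All numbers are hypothetical.
import math

DATASET_GB = 6144             # in-memory working set to keep resident
DIMM_SLOTS_PER_SOCKET = 12

for dimm_capacity_gb in (32, 64, 128):   # e.g. progressively higher-capacity modules
    per_socket_gb = DIMM_SLOTS_PER_SOCKET * dimm_capacity_gb
    sockets_needed = math.ceil(DATASET_GB / per_socket_gb)
    print(f"{dimm_capacity_gb} GB DIMMs -> {per_socket_gb} GB/socket, "
          f"{sockets_needed} socket(s) just to hold the data set")
```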
As architectures evolve and speeds increase, buffer chips continue to enable improvements in the memory system. Newer generations of buffer chips enable stacked DRAMs to be used on DIMMs, allowing even higher-capacity DIMM modules. Support for lower voltages in the memory system, along with error checking that improves the accuracy and reliability of data, provides additional benefit at the system level.
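As a rough illustration of the stacked-DRAM point, module capacity scales with the number of dies stacked inside each DRAM package; the package count and die density below are hypothetical placeholders, and ECC devices are ignored for simplicity.

```python
# Illustrative capacity arithmetic for stacked DRAM on DIMMs: stacking dies
# inside each DRAM package multiplies module capacity without adding packages
# or bus loads. Package count and die density are hypothetical.

DATA_PACKAGES_PER_DIMM = 32   # DRAM packages devoted to data (ECC devices ignored)
DIE_DENSITY_GBIT = 16         # capacity of a single DRAM die, in gigabits

for dies_per_stack in (1, 2, 4):
    capacity_gb = DATA_PACKAGES_PER_DIMM * dies_per_stack * DIE_DENSITY_GBIT / 8
    print(f"{dies_per_stack}-high stacks -> {capacity_gb:.0f} GB per DIMM")
```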
GPUs and FPGAs
Looking beyond the memory system, says Woo, GPUs and FPGAs have played a significant role in the evolution of data centers, with GPUs offering superior performance, power efficiency, and compute density for many SIMD-style workloads through the use of large numbers of processing pipelines that also offer some amount of programmability.
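As a loose analogy for SIMD-style workloads, the NumPy sketch below contrasts element-at-a-time processing with the same multiply-add expressed uniformly over a whole array, the kind of regular, data-parallel structure that maps well onto a GPU's many pipelines. (NumPy itself runs on the CPU here; the example only illustrates the programming pattern.)

```python
# A small NumPy analogy for SIMD-style workloads: one operation applied
# uniformly across many data elements, which maps naturally onto many
# parallel processing pipelines. Illustrative only -- this runs on the CPU.
import numpy as np

a = np.random.rand(100_000).astype(np.float32)
b = np.random.rand(100_000).astype(np.float32)

# Scalar-style loop: one element at a time, one "pipeline".
result_loop = np.empty_like(a)
for i in range(a.size):
    result_loop[i] = 2.0 * a[i] + b[i]

# Data-parallel form: the same multiply-add expressed over the whole array,
# the kind of uniform per-element work a GPU spreads across its pipelines.
result_vec = 2.0 * a + b

assert np.allclose(result_loop, result_vec)
```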
For other types of workloads, FPGAs provide a better-optimized solution thanks to their broad flexibility and reconfigurability. For example, while traditional CPUs offer the ability to run complete workloads of any type, FPGAs enable application-specific hardware acceleration and offload capabilities that can be updated over time.
With acceleration done in tailored and reconfigurable hardware, FPGAs provide an additional benefit of improved power efficiency – in part – by not needing to fetch and decode instructions as general purpose processors require. The flexibility afforded by FPGAs makes them a particularly powerful solution when paired with CPUs, allowing the benefits of general purpose computation to be combined with those of hardware-accelerated processing for time-consuming and power-hungry tasks.
“Scaling data centers is becoming increasingly challenging as the industry faces a future without the historic benefits of Moore’s Law and Dennard Scaling, leaving many to wonder if the time will come in the near future for more fundamental architectural changes to take hold,” Woo concluded. “Buffer chips, GPUs, and FPGAs have all helped to evolve server and data center architectures, and may well be opening the door for newer paradigms that can overcome bottlenecks in modern and future workloads. Data centers have been steadily shifting away from a ‘one size fits all’ model and this approach is likely to continue and even intensify in the coming months and years.”