Targeting Intel’s DDIO
White hat security researchers from Vrije Universiteit Amsterdam and ETH Zurich have unveiled a new exploit dubbed NetCAT. The exploit targets Intel’s Data-Direct I/O (DDIO) technology, a form of Direct Cache Access (DCA) found in recent generations of Intel server processors. As a performance optimization, DDIO grants network cards and other peripherals direct access to the CPU’s cache, and in doing so inadvertently exposes servers on untrusted local networks to remote side-channel attacks.
According to the researchers, NetCAT is the first network-based PRIME+PROBE cache attack against the last-level cache (LLC) of a remote machine’s processor.
“NetCAT not only enables attacks in cooperative settings where an attacker can build a covert channel between a network client and a sandboxed server process (without network), but more worryingly, in general adversarial settings. In such settings, NetCAT can enable disclosure of network timing-based sensitive information,” the researchers state in a recently published paper that offers an in-depth description of the exploit.
“As an example, we show a keystroke timing attack on a victim SSH connection belonging to another client on the target server. Our results should caution processor vendors against unsupervised sharing of (additional) microarchitectural components with peripherals exposed to malicious input.”
Leaking SSH Passwords
In real-world terms, NetCAT enables an attacker to discern an SSH password as it is typed into a terminal. As The Register’s Shaun Nichols explains, a well-positioned eavesdropper can connect to a server powered by one of Intel’s vulnerable processors and potentially observe the timing of packets of data – such as those carrying keypresses in an interactive terminal session – sent separately by a victim connected to the same server.
“These timings can leak the specific keys pressed by the victim due to the fact people move their fingers over their keyboards in a particular pattern, with noticeable pauses between each button push that vary by character,” writes Nichols. “These pauses can be analyzed to reveal, in real time, those specific keypresses sent over the network, including passwords and other secrets.”
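To make that idea concrete, the sketch below (not taken from the NetCAT paper or its tooling) shows how an observer might turn a list of packet arrival times into inter-keystroke intervals; the timestamps are made-up illustrative values, and a real attack would feed such intervals into a statistical classifier trained on typing patterns.

```c
/* Illustrative sketch only: compute inter-keystroke gaps from hypothetical
 * packet arrival times. Values and naming are assumptions for this example. */
#include <stdio.h>

int main(void) {
    /* Hypothetical times (milliseconds) at which the eavesdropper concluded
     * the victim sent a per-keystroke SSH packet. */
    const double arrivals_ms[] = { 0.0, 142.0, 318.0, 401.0, 655.0 };
    const int n = sizeof(arrivals_ms) / sizeof(arrivals_ms[0]);

    /* Each gap between consecutive packets approximates the pause between two
     * keypresses; these pauses are the raw signal a timing attack analyzes. */
    for (int i = 1; i < n; i++)
        printf("keystroke %d -> %d: %.1f ms\n", i - 1, i,
               arrivals_ms[i] - arrivals_ms[i - 1]);
    return 0;
}
```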
As Nichols notes, a determined attacker can monitor keystrokes by repeatedly sending a string of network packets to the server and filling one of the processor’s memory caches.
“As the victim sends in their packets, the snooper’s data is pushed out of the cache by this incoming traffic,” he elaborates. “As the eavesdropper quickly refills the cache, it can sense whether its data was still present or evicted from the cache, leaking the fact its victim had sent over some data. This ultimately can be used to determine the interval between the victim’s incoming packets and thus the keys pressed and transmitted by the victim.”
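The following minimal, local PRIME+PROBE sketch illustrates the prime/wait/probe cycle Nichols describes. It is an illustration under stated assumptions, not NetCAT itself: the real attack performs the equivalent steps remotely over RDMA against the server’s last-level cache, and the buffer layout, cycle threshold, and rdtscp-based timing here are stand-ins chosen to keep the demo self-contained on a single x86 machine.

```c
/* Local PRIME+PROBE sketch (x86, GCC/Clang). Threshold and set size are
 * illustrative assumptions, not values from the NetCAT paper. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>   /* __rdtscp */

#define LINES      16     /* lines in the monitored eviction set */
#define LINE_SIZE  64     /* cache line size in bytes */
#define THRESHOLD  120    /* cycles: above this we assume the line was evicted */

static uint64_t timed_read(volatile uint8_t *p) {
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;                          /* load the cache line */
    return __rdtscp(&aux) - start;
}

int main(void) {
    uint8_t *buf = malloc(LINES * LINE_SIZE);
    if (!buf) return 1;

    /* PRIME: load every line of our eviction set into the cache. */
    for (int i = 0; i < LINES; i++)
        (void)*(volatile uint8_t *)&buf[i * LINE_SIZE];

    /* ... victim activity (e.g., an incoming packet written via DDIO) may
     * evict some of these lines from the shared last-level cache ... */

    /* PROBE: re-read and time each line; a slow access means it was evicted,
     * i.e., something else touched that cache set in the meantime. */
    for (int i = 0; i < LINES; i++) {
        uint64_t cycles = timed_read(&buf[i * LINE_SIZE]);
        printf("line %2d: %4llu cycles -> %s\n", i, (unsigned long long)cycles,
               cycles > THRESHOLD ? "likely evicted" : "still cached");
    }

    free(buf);
    return 0;
}
```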
The Unintended Security Vulnerabilities of DDIO
According to the Vrije Universiteit Amsterdam (VUSec) website, DDIO was introduced specifically to improve the performance of server applications running on fast networks. Rather than having peripherals read from and write to slow main memory, DDIO lets them read and write directly in the fast last-level cache.
“In traditional architectures, where the network card uses direct memory access (DMA) to talk to the operating system, the memory latency alone quickly becomes the bottleneck on fast (e.g., 10Gb/s) networks,” the researchers explain. “To alleviate the bottleneck, Intel introduced DDIO, an architecture where peripherals can operate direct cache access on the CPU’s (last-level) cache. The DDIO cache region is not dedicated or reserved in the cache, but allocating writes are statically limited to a portion of the cache to avoid thrashing caused by I/O bursts or unconsumed data streams.”
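The latency gap between the last-level cache and DRAM is both the reason DDIO exists and the signal NetCAT measures. The local calibration sketch below is offered purely as an illustration (NetCAT derives the equivalent threshold remotely, by timing RDMA reads); it contrasts the two access times so a hit/miss threshold can be chosen.

```c
/* Local calibration sketch (x86, GCC/Clang): measure the cycle cost of a
 * cache hit versus a flushed (DRAM) access. An illustrative assumption, not
 * NetCAT's remote measurement primitive. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtscp, _mm_clflush, _mm_mfence */

#define ROUNDS 1000

static uint64_t timed_read(volatile uint8_t *p) {
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;
    return __rdtscp(&aux) - start;
}

int main(void) {
    static uint8_t target;
    uint64_t hit = 0, miss = 0;

    for (int i = 0; i < ROUNDS; i++) {
        /* Cached access: touch the line first, then time a re-read. */
        (void)*(volatile uint8_t *)&target;
        hit += timed_read(&target);

        /* Uncached access: flush the line to memory, then time the read. */
        _mm_clflush((void *)&target);
        _mm_mfence();
        miss += timed_read(&target);
    }

    printf("avg cache hit : %llu cycles\n", (unsigned long long)(hit / ROUNDS));
    printf("avg DRAM miss : %llu cycles\n", (unsigned long long)(miss / ROUNDS));
    /* A threshold between the two averages separates cache hits from memory
     * reads; the same distinction, made over the network, underpins NetCAT. */
    return 0;
}
```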
Complexity vs. Security
It is important to understand that hardware-based CPU vulnerabilities were inadvertently created over the years by well-meaning engineers focused on designing ever-faster silicon. CPU performance has improved dramatically in recent decades, an impressive feat made possible by chip architects who leveraged a range of clever techniques to squeeze as much performance as possible out of every transistor, even as the number of available transistors continued to grow in line with Moore’s Law.
As the years went on, new performance techniques were layered on top of old ones that remained in use, and each new generation was necessarily more complex because the straightforward optimizations had already been exhausted. As a result, the machinery behind chip performance grew steadily more intricate and multi-layered. From a security perspective, this complexity has arguably come at a cost. Increasing silicon complexity, across a diverse range of devices and verticals, practically guarantees that additional vulnerabilities of varying severity will continue to be introduced unknowingly into devices and systems. Moreover, a successful attacker needs to find only a single vulnerability, while system designers must secure a multitude of functions and interactions.
Despite these real-world security risks, techniques that accelerate CPUs remain critical as workloads become ever more processor intensive. System designers should therefore take a more comprehensive and holistic approach to security, looking beyond the microarchitectural question of ‘how do we optimize the CPU?’ to securing the system at its most fundamental architectural level. Put simply, semiconductor security is dynamic and should evolve organically to intelligently and proactively protect changing workloads and applications.