January 15, 2025
Why Your Multicore Processor Isn't as Safe as You Think
Modern processors are marvels of engineering. Today's chips pack multiple cores, massive caches, and complex memory hierarchies, all working together to make your software run fast. But for systems where failure isn't an option (think aircraft navigation, automotive braking systems, or industrial control) that complexity comes with a hidden cost.
The problem? Shared resources. And specifically, the shared cache.
The Shift to Multicore in Safety-Critical Systems
For decades, safety-critical systems relied on custom hardware or dedicated single-core processors. These were predictable, certifiable, and importantly, isolated. One task, one processor, no interference.
But that world is changing fast. Single-core processors are increasingly rare on the market, and the cost of developing custom silicon is prohibitive. So avionics manufacturers, automotive OEMs, and industrial system designers are turning to Commercial-Off-The-Shelf (COTS) multicore processors, the same kind of chips powering your laptop or server.
This introduces a new challenge: on a multicore chip, multiple applications share the same last-level cache (LLC). A non-critical task (say, a logging daemon running alongside a flight-critical control loop) can unintentionally (or intentionally) evict the critical task's data from the cache, forcing it to fetch from slower main memory. This isn't just a performance annoyance. In a hard real-time system, it can violate timing guarantees.

Cache Partitioning: The Obvious Fix
The intuitive answer to cache interference is cache partitioning: slice the cache into non-overlapping regions and assign each application its own exclusive section. If the logging daemon can only ever touch its own 2 MB of cache, it can never evict the flight controller's data.
There are two broad approaches:
Static partitioning allocates fixed regions at boot time. It's predictable, easy to reason about, and great for certification, but wasteful. If one partition sits mostly empty while another is under pressure, there's no mechanism to rebalance.
Dynamic partitioning adjusts allocations at runtime based on demand. This is more efficient, but introduces overhead from continuous monitoring and reallocation, and critically, opens new attack surfaces.
Hardware vendors have built dedicated support for this. Intel's Cache Allocation Technology (CAT), part of its broader Resource Director Technology (RDT) framework, lets software restrict which portions (ways) of the LLC each application may allocate into, using per-class capacity bitmasks. ARM has its equivalent in Memory Partitioning and Monitoring (MPAM). AMD, notably, does not have a dedicated hardware mechanism and relies on OS-level policies instead.
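To make the capacity-bitmask idea concrete, here is a small Python sketch of carving an LLC's ways into non-overlapping, contiguous masks of the kind CAT consumes, rendered in the schemata format used by Linux's resctrl interface. The 20-way LLC, partition names, and way counts are illustrative assumptions, not a recommendation for any real system.

```python
# Sketch: carve an N-way LLC into non-overlapping, contiguous capacity
# bitmasks (the shape Intel CAT expects). Way counts and partition
# names are illustrative assumptions.

def carve_ways(total_ways, requests):
    """requests: {name: way_count}; returns {name: bitmask}."""
    masks, next_way = {}, 0
    for name, ways in requests.items():
        if next_way + ways > total_ways:
            raise ValueError("LLC over-committed")
        # CAT requires each mask to be a contiguous run of set bits.
        masks[name] = ((1 << ways) - 1) << next_way
        next_way += ways
    return masks

masks = carve_ways(20, {"flight_control": 12, "logging": 4, "best_effort": 4})

# Isolation check: no two partitions may share a way.
names = list(masks)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        assert masks[names[i]] & masks[names[j]] == 0

# A schemata-style line for cache domain 0, per the Linux kernel's
# resctrl documentation.
print(f"L3:0={masks['flight_control']:x}")   # prints L3:0=fff
```

Because the masks are disjoint by construction, the logging partition can never evict a flight-control line, which is exactly the guarantee static partitioning is meant to give.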

But Cache Partitioning Isn't Enough
Here's where it gets interesting, and concerning.
Even with cache partitioning in place, two classes of attacks can still compromise isolation:

1. Denial-of-Service (DoS) Attacks
In a DoS attack on the cache, a malicious (or misbehaving) process intentionally floods cache partitions with its own data. The goal is to cause the victim process to experience repeated cache misses, forcing it to constantly reload from DRAM. The latency spike this creates can blow a real-time task's deadline.
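A toy simulation makes the mechanism plain: once an attacker touches as many same-set lines as the cache has ways, an LRU set must evict the victim's line. The set count, associativity, and line size below are illustrative, not any real chip's parameters.

```python
# Toy cache-DoS simulation: an attacker streams through lines that
# alias to the victim's set until LRU replacement evicts the victim.
# Geometry (line size, sets, ways) is an illustrative assumption.

from collections import OrderedDict

LINE, SETS, WAYS = 64, 1024, 16

class LruSet:
    def __init__(self, ways):
        self.ways, self.lines = ways, OrderedDict()
    def access(self, addr):
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)
        else:
            if len(self.lines) >= self.ways:
                self.lines.popitem(last=False)  # evict least-recently-used
            self.lines[addr] = True
        return hit

def set_index(addr):
    return (addr // LINE) % SETS

victim_addr = 0x1000
s = LruSet(WAYS)
s.access(victim_addr)                        # victim warms its line

# Attacker touches WAYS distinct lines with the same set index.
for i in range(1, WAYS + 1):
    attacker = victim_addr + i * SETS * LINE
    assert set_index(attacker) == set_index(victim_addr)
    s.access(attacker)

victim_hit = s.access(victim_addr)
print(victim_hit)   # False: the victim now misses and must refetch
```

In a real system that miss means a trip to DRAM, and in a hard real-time loop that extra latency is exactly what blows the deadline.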
Experiments on Intel's Cascade Lake and Ice Lake processors have revealed that CAT does not consistently reduce cache interference in practice. With larger working set sizes, significant performance degradation still occurs, and the behavior varies across processor generations, making it hard to certify.
A more sophisticated variant is the memory-aware DoS attack, which targets DRAM bank structure rather than the cache directly. By crafting linked lists that all map to the same DRAM bank, an attacker can trigger sustained bank conflicts, essentially weaponizing the memory controller. This attack is more effective than cache-only DoS strategies, and it only requires HugePage support (standard on most modern OSes) and knowledge of DRAM bank mapping.
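The address-to-bank reasoning behind such an attack can be sketched as follows. Real memory controllers typically XOR several physical-address bits to select a bank; the bit positions below are invented for illustration, not any real chip's mapping.

```python
# Sketch of memory-aware DoS targeting: compute a (made-up) XOR bank
# function, then enumerate HugePage offsets that land in one bank.

BANK_BITS = [(13, 17), (14, 18), (15, 19)]   # illustrative XOR pairs

def bank(addr):
    b = 0
    for i, (lo, hi) in enumerate(BANK_BITS):
        b |= (((addr >> lo) ^ (addr >> hi)) & 1) << i
    return b

# A 2 MB HugePage gives the attacker control of physical bits 0..20,
# enough to pick offsets that all collide in the target bank.
base, page = 0x40000000, 2 * 1024 * 1024
target = bank(base)
same_bank = [base + off for off in range(0, page, 1 << 13)
             if bank(base + off) == target]

print(len(same_bank))   # 32 same-bank addresses to chain into a list
```

Chasing a linked list built from `same_bank` keeps every access in one DRAM bank, producing the sustained bank conflicts the attack relies on.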
2. Side-Channel Attacks
Side-channel attacks are subtler. Rather than disrupting the victim, they observe it, exploiting the measurable difference in timing between a cache hit and a cache miss to infer what data the victim is accessing.
The classic example is Flush+Reload: an attacker flushes a specific cache line, waits for the victim to run, then measures how long it takes to reload. A fast reload means the victim accessed that data. By repeating this across many cache lines, the attacker can reconstruct sensitive behavior, including recovering a full AES-128 encryption key in near real-time, as demonstrated in experiments.
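The decision the attacker makes at each probe is just a threshold test on reload latency. The cycle counts below are synthetic stand-ins for `rdtsc` measurements; a real Flush+Reload also needs `clflush` and memory shared with the victim.

```python
# Flush+Reload distilled to its classification step: fast reload means
# the victim touched the line since the last flush. Latencies here are
# synthetic; the threshold is calibrated per machine in practice.

HIT_THRESHOLD = 100   # cycles, illustrative

def probe(reload_cycles):
    return reload_cycles < HIT_THRESHOLD

# One monitored line per (hypothetical) AES T-table offset; fast
# reloads reveal which key-dependent entries the victim accessed.
measured = {0x00: 320, 0x40: 68, 0x80: 310, 0xC0: 71}
touched = sorted(off for off, t in measured.items() if probe(t))
print(touched)   # [64, 192]
```

Repeating this over many lines and many victim runs is what lets the published experiments reconstruct a full key.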
High-profile attacks like Meltdown, Spectre, and Foreshadow are the most famous examples of this class of vulnerability. They show that even formally isolated applications can leak secrets through the shared microarchitecture.

What Can We Do? Current Countermeasures
Research in this space is active, and several promising directions exist:
Slice-Aware Memory Management
Intel's LLC is physically divided into slices, each associated with a specific CPU core. Slice-aware memory management exploits knowledge of the LLC's addressing scheme to place latency-critical data in the optimal slice, achieving roughly 11% better performance than CAT alone in experiments. The trade-off: it can increase vulnerability to memory-aware DoS attacks by concentrating data in fewer slices.
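A simplified model of slice selection shows what "slice-aware" means in practice: the hardware XOR-folds physical address bits into a slice index. Intel's actual hash is undocumented and has only been reverse-engineered per generation, so the masks below are illustrative placeholders.

```python
# Simplified LLC slice hash: each slice-index bit is the parity of the
# address ANDed with a mask. Mask values are illustrative, not a real
# chip's reverse-engineered hash.

SLICE_MASKS = [0x1B5F575440, 0x2EB5FAA880]   # 2 bits -> 4 slices

def parity(x):
    return bin(x).count("1") & 1

def slice_of(addr):
    return sum(parity(addr & m) << i for i, m in enumerate(SLICE_MASKS))

# Slice-aware placement: keep only buffers whose lines land in the
# slice local to the core running the latency-critical task.
candidates = [0x1000 * i for i in range(16)]
local = [a for a in candidates if slice_of(a) == 0]
print(len(local) > 0)
```

Filtering allocations this way shortens the on-die path to data, which is where the reported gain over CAT alone comes from; it is also why concentrating data in fewer slices can make a DoS adversary's job easier.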
MemGuard and OS-Level Bandwidth Regulation
Intel's Memory Bandwidth Allocation (MBA) was designed to throttle memory bandwidth per application, but it has a critical flaw: it regulates read and write operations together, not independently. Since reads (cache-line refills) and writes (writebacks) stress different hardware structures, this limits its effectiveness.
Software solutions like MemGuard use performance counters to regulate cache misses and writebacks independently, providing finer-grained control. This has proven effective at protecting real-time tasks from interference.
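The core loop of a MemGuard-style regulator can be sketched in a few lines: give each core separate per-period budgets for misses and writebacks, and throttle when either runs out. The budget values and event feed below are made up; a real implementation is driven by PMU overflow interrupts and a periodic timer.

```python
# MemGuard-style regulation, sketched: per-core budgets for cache-miss
# refills and writebacks, tracked independently. Budgets and the event
# stream are illustrative assumptions.

class Regulator:
    def __init__(self, miss_budget, wb_budget):
        self.budgets = {"miss": miss_budget, "writeback": wb_budget}
        self.used = {"miss": 0, "writeback": 0}
        self.throttled = False

    def on_event(self, kind):            # fed by a PMU overflow interrupt
        self.used[kind] += 1
        if self.used[kind] >= self.budgets[kind]:
            self.throttled = True        # e.g. deschedule until next period

    def new_period(self):                # periodic timer tick
        self.used = {"miss": 0, "writeback": 0}
        self.throttled = False

reg = Regulator(miss_budget=1000, wb_budget=400)
for _ in range(400):
    reg.on_event("writeback")
print(reg.throttled)    # True: writeback budget exhausted, reads untouched
reg.new_period()
print(reg.throttled)    # False: fresh budgets each period
```

Tracking the two event types separately is precisely what MBA cannot do, and it is why this scheme gives finer-grained protection.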
SecDCP: Secure Dynamic Cache Partitioning
SecDCP is a particularly promising approach for tackling side-channel attacks in dynamic partitioning scenarios. It defines security classes (high-security and low-security) and uses a Partition Allocation Algorithm (PAA) to adjust cache sizes based on demand, while a Partition Enforcement Mechanism (PEM) handles the transition with minimal overhead, flushing cache lines only when necessary.
SecDCP shows an average 6.4% performance improvement over static partitioning, with some benchmarks reaching up to 20%. The key insight: by avoiding constant adjustments for confidential partitions and only flushing non-secure caches when reallocating to critical tasks, it limits information leakage without sacrificing too much efficiency.
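The policy/mechanism split described above can be sketched in miniature: a demand-driven allocator plays the PAA role, and an enforcement step in the PEM role flushes only the ways a high-security task takes over from low-security owners. Task names, way counts, and the demand metric are illustrative, not SecDCP's actual algorithm.

```python
# Miniature SecDCP-style repartitioning: resize partitions by demand,
# flushing only on low-security -> high-security way handovers so the
# reallocation itself cannot leak. All parameters are illustrative.

TOTAL_WAYS = 16

def reallocate(alloc, demand, security):
    """alloc/demand: {task: ways}; security: {task: 'hi'|'lo'}."""
    total = sum(demand.values())
    new = {t: max(1, TOTAL_WAYS * d // total) for t, d in demand.items()}
    while sum(new.values()) > TOTAL_WAYS:      # trim rounding overshoot
        new[min(new, key=new.get)] -= 1
    flushes = 0
    for t, ways in new.items():
        gained = ways - alloc.get(t, 0)
        # PEM role: flush only ways a high-security task inherits.
        if gained > 0 and security[t] == "hi":
            flushes += gained
    return new, flushes

alloc = {"crypto": 8, "logger": 8}
demand = {"crypto": 12, "logger": 4}
sec = {"crypto": "hi", "logger": "lo"}
new, flushes = reallocate(alloc, demand, sec)
print(new, flushes)   # crypto grows to 12 ways; only those 4 are flushed
```

Flushing only on that one transition is the "minimal overhead" part: partitions that shrink, or low-security partitions that grow, cost nothing.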

Looking Ahead
Cache partitioning is necessary but not sufficient. The field is exploring more radical ideas, like Untrusted Core Isolation (UCI), an architectural model that physically separates trusted and untrusted cores to prevent microarchitectural interference entirely. Early research suggests this mitigates many attack classes, though challenges like cache coherence remain unsolved.
The underlying tension is real: performance, flexibility, and security are in constant conflict in shared-cache systems. Static partitioning maximizes predictability but wastes resources. Dynamic partitioning improves efficiency but introduces attack surfaces. Every countermeasure carries overhead.
What's clear is that COTS multicore processors cannot simply be dropped into safety-critical systems without careful analysis of cache behavior. The certifications that matter in avionics (DO-178C) and automotive (ISO 26262) demand guarantees that today's hardware and software don't yet fully provide.
This is an open and important problem, one that will only grow as the industry's reliance on multicore platforms deepens.
References
- [1] Bechtel & Yun (2022). Memory-Aware Denial-of-Service Attacks on Shared Cache in Multicore Real-Time Systems. IEEE Transactions on Computers.
- [2] Sohal et al. (2022). A Closer Look at Intel Resource Director Technology (RDT). RTNS '22.
- [3] Wang et al. (2016). SecDCP: Secure Dynamic Cache Partitioning for Efficient Timing Channel Protection.
- [4] Asmussen et al. (2025). Distrusting Cores by Separating Computation from Isolation. Journal of Systems Architecture.