News Froggy
Programming

Debugging Linux Kernel Freezes: An eBPF Spinlock Saga

This article details the journey of debugging mysterious system freezes caused by eBPF programs in the Linux kernel. We uncovered an issue where an NMI-driven eBPF sampling program would self-deadlock by attempting to acquire a spinlock already held by another eBPF program on the same CPU, leading to 250ms kernel timeouts. The analysis highlights the complexities of spinlocks, NMIs, and cache coherence in kernel development.

Published: March 18, 2026
Reading Time: 7 min

As developers, we often pride ourselves on creating robust software that "just works." So, when our CPU profiler, Superluminal, started causing periodic full system freezes on a tester's Fedora 42 machine (kernel 6.17.4-200), we knew we had a serious challenge on our hands. This wasn't just a simple crash; the entire system would become unresponsive for short bursts, making traditional debugging nearly impossible. The hunt for this elusive bug led us deep into the Linux kernel's intricate world of eBPF and spinlocks.

Initial Clues from a Frozen System

Our first step was to analyze the system's behavior. Superluminal captures revealed suspicious periods, over 250 milliseconds long, where all threads appeared busy, yet no samples were being collected. Concurrently, dmesg output showed alarming messages like:

INFO: NMI handler (perf_event_nmi_handler) took too long to run: 250.424 msecs

These messages perfectly matched the freeze durations, strongly suggesting a kernel-level issue, specifically within a Non-Maskable Interrupt (NMI) handler. However, trying to attach a debugger to a freezing kernel instance proved futile; gdb itself would crash or time out, leaving us without direct insight into the kernel's state during these critical moments.

Isolating the Problem with a Minimal Repro

With direct debugging stalled, our strategy shifted to creating a minimal reproduction. Superluminal's Linux backend is substantial, involving around 2000 lines of eBPF code. We suspected the issue lay in how our eBPF programs interacted with kernel events. We categorize our eBPF events into three main types: sampling, context switch, and wake events.

Through systematic testing, enabling and disabling these event types, we made a crucial observation:

  • Neither sampling events alone nor context switch/wake events alone caused freezes.
  • Freezes only occurred when both sampling events and context switch events were enabled, even with wake events disabled.
  • Reducing the sampling frequency decreased the frequency of freezes but didn't eliminate them.

This pointed to an interaction bug. We then painstakingly stripped down our eBPF code, keeping only the bare essentials for sampling and context switch events, until we arrived at this minimal, freeze-inducing eBPF program:

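```c
/* Headers assumed for a libbpf CO-RE build; the article does not show them. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* struct CSwitchEvent and struct SampleEvent are defined elsewhere in the
 * original source and are not shown in the article. */

/* Ring buffer shared by both programs below. */
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 512 * 1024 * 1024);
} ringBuffer SEC(".maps");

/* Runs on every context switch. */
SEC("tp_btf/sched_switch")
int cswitch(struct bpf_raw_tracepoint_args* inContext)
{
    struct CSwitchEvent* event = bpf_ringbuf_reserve(&ringBuffer, sizeof(struct CSwitchEvent), 0);
    if (event == NULL)
        return 1;

    bpf_ringbuf_discard(event, 0);
    return 0;
}

/* Runs on every sampling event, which is delivered via an NMI. */
SEC("perf_event")
int sample(struct bpf_perf_event_data* inContext)
{
    struct SampleEvent* event = bpf_ringbuf_reserve(&ringBuffer, sizeof(struct SampleEvent), 0);
    if (event == NULL)
        return 1;

    bpf_ringbuf_discard(event, 0);
    return 0;
}
```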

These programs do almost nothing beyond attempting to reserve and then immediately discard space in a BPF ring buffer using bpf_ringbuf_reserve and bpf_ringbuf_discard.

Unmasking the Spinlock Issue

Given the minimal code, bpf_ringbuf_reserve became our prime suspect. A quick look at its kernel implementation revealed it's guarded by a spinlock, which is taken with raw_res_spin_lock_irqsave and released with raw_res_spin_unlock_irqrestore. These functions are designed to disable local interrupts and preemption to protect critical sections of code. However, the local_irq_save component only disables maskable interrupts, so an NMI can still fire while the lock is held.

Our key observation about sampling events, which trigger Non-Maskable Interrupts (NMIs), immediately sparked a hypothesis:

  1. An eBPF program, perhaps the context switch handler, acquires the ring buffer spinlock on a CPU.
  2. This spinlock disables maskable interrupts but critically, not NMIs.
  3. While the lock is held, a sampling NMI occurs on the same CPU.
  4. The NMI handler, running on the same CPU, then also attempts to acquire the same ring buffer spinlock.

Since the spinlock is already held by the initial eBPF program on that CPU, the NMI handler would enter a spin-wait loop. Crucially, the spinlock implementation includes a timeout to prevent indefinite spinning. The RES_DEF_TIMEOUT constant, often used in these spin-wait loops, is defined as NSEC_PER_SEC / 4, which is precisely 0.25 seconds, or 250 milliseconds.

This was our "smoking gun." The 250ms spinlock timeout matched both the 250+ ms system freezes and the NMI handler warnings in dmesg. The CPU was effectively deadlocking against itself: the NMI handler tried to acquire a spinlock already held by the code it had interrupted on the same CPU, and that code could not release the lock because it could not run again until the NMI handler returned. The handler therefore spun for the full timeout before giving up, and that spin was the freeze we observed.
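The sequence described in this section can be modeled in userspace. The sketch below is not the kernel's implementation: a plain function call stands in for the NMI, an atomic_flag stands in for the ring buffer lock, and the names lock_with_timeout, fake_nmi_handler, and simulate_self_deadlock are hypothetical. Only the timeout value, NSEC_PER_SEC / 4, is taken from the text above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL
/* Value quoted in the article: a quarter of a second, in nanoseconds. */
#define RES_DEF_TIMEOUT (NSEC_PER_SEC / 4)
static_assert(RES_DEF_TIMEOUT == 250000000LL, "250 ms expressed in ns");

/* Stand-in for the ring buffer spinlock (not the kernel's lock type). */
static atomic_flag ring_lock = ATOMIC_FLAG_INIT;

int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Hypothetical helper: spin until the lock is free or timeout_ns elapses.
 * Returns 0 on success and -1 on timeout. */
int lock_with_timeout(atomic_flag *lock, int64_t timeout_ns)
{
    int64_t deadline = now_ns() + timeout_ns;

    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        if (now_ns() >= deadline)
            return -1;
    }
    return 0;
}

/* Stand-in for the sampling program running in NMI context.
 * Reports how long it spun through *spun_ns. */
int fake_nmi_handler(int64_t *spun_ns)
{
    int64_t start = now_ns();
    int ret = lock_with_timeout(&ring_lock, RES_DEF_TIMEOUT);

    *spun_ns = now_ns() - start;
    if (ret == 0)
        atomic_flag_clear_explicit(&ring_lock, memory_order_release);
    return ret;
}

/* Stand-in for the context switch program: it takes the lock and is then
 * "interrupted" on the same thread, so it cannot unlock until the handler
 * returns. */
int simulate_self_deadlock(int64_t *spun_ns)
{
    int ret;

    while (atomic_flag_test_and_set_explicit(&ring_lock, memory_order_acquire))
        ;
    ret = fake_nmi_handler(spun_ns); /* the lock holder is stalled here */
    atomic_flag_clear_explicit(&ring_lock, memory_order_release);
    return ret;
}
```

Calling simulate_self_deadlock from a main function should return -1 after spinning for at least 250 ms, because the only code that could release the lock is the code waiting for the handler to finish.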

A Primer on Spinlocks and Their Pitfalls

This incident highlights some fundamental challenges with spinlocks, especially in kernel contexts. A basic spinlock works by repeatedly attempting an atomic compare-and-swap (CAS) operation until it successfully changes a locked flag from 0 to 1. If the CAS fails, it means another thread holds the lock, and the current thread "spins" in a loop, wasting CPU cycles.
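As a rough illustration of the CAS loop just described, here is a minimal userspace sketch using C11 atomics and POSIX threads. The names demo_spinlock, demo_spin_lock, demo_spin_unlock, and run_counter_demo are made up for this example; kernel spinlocks are considerably more sophisticated than this.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define N_THREADS 4
#define N_ITERS 50000

struct demo_spinlock {
    atomic_int locked; /* 0 = free, 1 = held */
};

void demo_spin_lock(struct demo_spinlock *l)
{
    int expected = 0;

    /* Keep trying to flip locked from 0 to 1. */
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        /* A failed CAS stores the current value in expected, so reset it. */
        expected = 0;
    }
}

void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static struct demo_spinlock counter_lock; /* zero-initialized: free */
static long counter;                      /* protected by counter_lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_ITERS; i++) {
        demo_spin_lock(&counter_lock);
        counter++;
        demo_spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs N_THREADS workers that each increment the counter N_ITERS times
 * and returns the final count. */
long run_counter_demo(void)
{
    pthread_t threads[N_THREADS];

    counter = 0;
    for (int i = 0; i < N_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

If the lock is correct, run_counter_demo returns exactly N_THREADS * N_ITERS; without the lock, concurrent increments could be lost.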

Beyond wasted cycles, spinlocks can suffer from severe performance degradation due to "cache line bouncing." Modern CPUs use protocols like MESI to maintain cache coherence. When multiple CPUs contend for a spinlock, they repeatedly try to write to the locked flag, which sits in a single cache line. Each write attempt requires a CPU to acquire the cache line in a Modified state, invalidating it in all other CPUs' caches. This generates a constant "storm" of expensive inter-core coherence traffic, with performance degrading quadratically with the number of contenders. Basic spinlocks are also unfair: nothing guarantees that a waiting thread will ever acquire the lock, so a thread can starve if others keep winning the race.
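One widely used way to reduce this write traffic is a "test-and-test-and-set" loop, where waiters spin on a plain read and only attempt the atomic write once the lock looks free. The sketch below is an illustration of that general technique, not something the kernel code discussed in this article is claimed to do. The names ttas_lock, ttas_acquire, ttas_release, and run_ttas_demo are hypothetical, and the 64-byte cache line size is an assumption that varies by CPU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumption: common on x86-64, not universal */
#define N_THREADS 4
#define N_ITERS 50000

struct ttas_lock {
    /* Give the flag its own cache line so unrelated data does not share it. */
    alignas(CACHE_LINE_SIZE) atomic_int locked;
};

void ttas_acquire(struct ttas_lock *l)
{
    for (;;) {
        /* Read-only wait: the line can stay shared in every waiter's cache. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            ;

        /* The lock looked free, so only now attempt the write. */
        int expected = 0;
        if (atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
                                                    memory_order_acquire,
                                                    memory_order_relaxed))
            return;
    }
}

void ttas_release(struct ttas_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static struct ttas_lock demo_lock; /* zero-initialized: free */
static long shared_counter;        /* protected by demo_lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_ITERS; i++) {
        ttas_acquire(&demo_lock);
        shared_counter++;
        ttas_release(&demo_lock);
    }
    return NULL;
}

/* Runs N_THREADS workers and returns the final counter value. */
long run_ttas_demo(void)
{
    pthread_t threads[N_THREADS];

    shared_counter = 0;
    for (int i = 0; i < N_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

The read-first loop does not make the lock fair, and every release still invalidates the line in the waiters' caches; it only avoids issuing a write on every spin iteration.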

In our case, the specific issue wasn't just general contention, but a critical interaction with NMIs and the interrupt masking properties of raw_res_spin_lock_irqsave. The fact that NMIs cannot be masked meant that they could interrupt code holding a spinlock, then attempt to acquire the same lock, leading to a self-deadlock scenario and the subsequent timeout-induced freezes.

Upon reporting our findings to the eBPF kernel mailing list, our analysis was confirmed, leading to further investigations and fixes by kernel maintainers. This journey underscored the importance of understanding the deep interactions between eBPF, kernel primitives, and hardware specifics like NMIs and cache coherence.

FAQ

Q: What distinguishes a Non-Maskable Interrupt (NMI) from a regular interrupt, and why is this relevant to the eBPF spinlock issue?

A: NMIs are special hardware interrupts that cannot be disabled or "masked" by software, unlike regular maskable interrupts. This is critical because kernel code often acquires spinlocks after disabling local interrupts to protect critical sections. If an NMI occurs while such a spinlock is held on the same CPU, and the NMI handler then tries to acquire the same spinlock, it will spin indefinitely (or until a timeout) because the original holder cannot release the lock while the NMI handler is executing.

Q: How does "cache line bouncing" impact spinlock performance, especially in highly contended scenarios?

A: Cache line bouncing, or ping-ponging, occurs when a shared memory location (like a spinlock's locked flag) is frequently written to by multiple CPUs. According to the MESI protocol, a CPU writing to a cache line must acquire it in a Modified state, which requires invalidating that cache line in all other CPUs' caches. When many CPUs contend for a spinlock, they continuously invalidate and re-acquire the cache line, leading to a surge of expensive inter-core communication that dramatically slows down access to the shared resource, often worsening quadratically with the number of contending CPUs.

Q: Why did reducing the eBPF sampling frequency only make the freezes less frequent rather than eliminating them entirely?

A: Reducing the sampling frequency decreases the probability of an NMI (which triggers the eBPF sampling program) occurring precisely when the context switch eBPF program holds the problematic ring buffer spinlock on the same CPU. While the likelihood of this specific race condition decreases, the fundamental flaw in the spinlock's interaction with NMIs remains. Therefore, given enough time or sufficient system load, the specific timing conditions for the freeze can still be met, just less often.

Tags: eBPF, Linux Kernel, Spinlocks, Debugging, Performance
