Profiling used to mean attaching to a process locally and reproducing the bug. Continuous profiling (always-on, low-overhead, aggregated) lets you ask 'what was the CPU doing during the spike yesterday' from a UI. In 2025-2026 it became cheap enough to leave on permanently.

eBPF-based sampling

An eBPF profiler attaches to perf_events, samples stacks at ~100 Hz, and exports them as pprof. Overhead is typically under 1% CPU. It works across languages without per-language agents. Parca and Pyroscope both support an eBPF mode.
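To make the sampling model concrete, here is a minimal sketch of what happens to the samples after collection: identical stacks are collapsed into counts, and each count is converted to on-CPU time using the sampling rate. The function names and sample stacks are hypothetical and written for illustration; a real profiler does this in the agent and encodes the result as pprof.

```python
from collections import Counter

SAMPLE_HZ = 100  # sampling frequency from the text (~100 Hz)


def fold(samples):
    """Collapse raw stack samples into 'root;...;leaf' -> sample count."""
    return Counter(";".join(stack) for stack in samples)


def cpu_seconds(folded, hz=SAMPLE_HZ):
    """Each sample stands for roughly 1/hz seconds of on-CPU time."""
    return {stack: count / hz for stack, count in folded.items()}


# Hypothetical samples: each one is a call stack listed root-first.
samples = (
    [("main", "handle_request", "parse_json")] * 150
    + [("main", "handle_request", "render")] * 50
)

folded = fold(samples)
seconds = cpu_seconds(folded)

for stack, secs in sorted(seconds.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{secs:6.2f}s  {stack}")
```

The folded 'root;...;leaf' form is the same shape that flamegraph tools consume, which is why sampled stacks aggregate so easily across hosts.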

What you get

Flamegraphs are aggregated across all instances, so you can ask: 'Show me the CPU breakdown by function in the checkout service yesterday from 14:00 to 15:00'. You can also compare two time windows, for example before and after a deploy.
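Comparing two windows can be sketched as a diff over folded stack counts. The example below normalizes each window to a share of total samples, so windows with different traffic levels stay comparable, then ranks stacks by how much their share changed. The stack names and counts are made up, and real tools diff whole flamegraph trees rather than flat stacks.

```python
def shares(folded):
    """Convert sample counts into each stack's fraction of the window."""
    total = sum(folded.values())
    if total == 0:
        return {}
    return {stack: count / total for stack, count in folded.items()}


def diff_profiles(before, after):
    """Rank stacks by change in share, largest increase first."""
    b, a = shares(before), shares(after)
    deltas = {s: a.get(s, 0.0) - b.get(s, 0.0) for s in set(a) | set(b)}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical folded profiles for the same service, before and after a deploy.
before = {
    "main;checkout;price": 600,
    "main;checkout;serialize": 400,
}
after = {
    "main;checkout;price": 500,
    "main;checkout;serialize": 500,
    "main;checkout;retry": 1000,
}

diff = diff_profiles(before, after)
for stack, delta in diff:
    print(f"{delta:+.2%}  {stack}")
```

A stack that appears only in the 'after' window, like the hypothetical retry path here, rises to the top of the diff.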

Memory profiling too

For Java, use JFR allocation profiling; for Go, the heap profile via pprof. Pyroscope ingests both. A memory leak shows up as steady growth in retained heap attributed to a specific allocation site.
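The 'steady growth' signal can be approximated with a simple trend check. This sketch assumes you have already extracted in-use memory per allocation site from successive heap snapshots; the site names, values, and threshold are hypothetical. It fits a least-squares slope per site and flags sites that grow faster than the threshold.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def leak_suspects(series_by_site, min_growth):
    """Return sites whose in-use memory grows by at least min_growth per snapshot."""
    return sorted(
        site
        for site, values in series_by_site.items()
        if slope(values) >= min_growth and values[-1] > values[0]
    )


# Hypothetical in-use MiB per allocation site across five heap snapshots.
series = {
    "cache.put": [100, 120, 140, 160, 180],      # climbs every snapshot
    "http.read_body": [50, 80, 40, 70, 50],      # noisy but flat
}

suspects = leak_suspects(series, min_growth=10)
print(suspects)
```

A slope test is deliberately crude: it separates a site that climbs every snapshot from one that merely fluctuates, which is usually enough to decide where to look first.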

Cost

Storage is dominated by stack-trace cardinality, not request volume. Even at 10K hosts, profile storage is small (~GB/day) compared to logs (~TB/day). The biggest cost: the engineer time spent NOT looking at flamegraphs because they didn't know they were available.
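One way to see why storage tracks stack-trace cardinality rather than volume is to intern stacks. In this minimal sketch (a hypothetical in-memory store, not how any particular backend is implemented), each distinct stack is stored once and samples only increment a counter keyed by time window and stack id.

```python
class ProfileStore:
    """Toy store: unique stacks are kept once, samples become counters."""

    def __init__(self):
        self.stack_ids = {}  # stack tuple -> small integer id
        self.counts = {}     # (window, stack id) -> sample count

    def add(self, window, stack):
        # Assign the next free id only the first time a stack is seen.
        sid = self.stack_ids.setdefault(stack, len(self.stack_ids))
        key = (window, sid)
        self.counts[key] = self.counts.get(key, 0) + 1


# Hypothetical workload: many samples, few distinct stacks.
stacks = [
    ("main", "checkout", "price"),
    ("main", "checkout", "serialize"),
    ("main", "gc"),
]

store = ProfileStore()
for i in range(10_000):
    store.add(window="14:00", stack=stacks[i % len(stacks)])

print(len(store.stack_ids), "unique stacks,", len(store.counts), "counter rows")
```

Ten thousand samples produce three stored stacks and three counter rows; adding more samples of the same stacks adds no new rows.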

Where it shines

Diagnosing 'why is p99 latency 30% worse since Tuesday's deploy': a flamegraph diff usually points to the cause quickly. CPU regressions, lock contention, GC pressure, and unexpected I/O are all visible without instrumentation changes.

Always-on eBPF profiling is the 2026 default. <1% overhead, massive debugging leverage.