Stream-table duality is the idea underneath Kafka Streams and Flink SQL: a stream of changes can be replayed to construct a table, and a table is a snapshot of that stream at a point in time whose changes can in turn be read as a stream. Joins, aggregations, and materialized views in stream processing all build on this duality.

The basic insight

Take a stream of (user_id, last_login_time) events. Read all of them and keep the latest value per user_id: that's a table. Capture every change to that table in commit order: that's a stream again (the table's changelog). Same data, two views.
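Both directions can be sketched in a few lines of plain Python. This is a minimal illustration, not any framework's API: `stream_to_table` and `ChangelogTable` are hypothetical names, and the sketch assumes events arrive in order, so "latest" simply means "last seen".

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int]  # (user_id, last_login_time)


def stream_to_table(events: Iterable[Event]) -> Dict[str, int]:
    """Fold a stream into a table: the last event seen per key wins."""
    table: Dict[str, int] = {}
    for user_id, login_time in events:
        table[user_id] = login_time
    return table


class ChangelogTable:
    """A table that records every write, in commit order, as a changelog."""

    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}
        self.changelog: List[Event] = []

    def upsert(self, user_id: str, login_time: int) -> None:
        self.rows[user_id] = login_time
        self.changelog.append((user_id, login_time))


events = [("alice", 100), ("bob", 105), ("alice", 230)]

# Stream -> table: keep the latest value per key.
table = stream_to_table(events)

# Table -> stream: the changelog is a stream that rebuilds the same table.
source = ChangelogTable()
for user_id, login_time in events:
    source.upsert(user_id, login_time)
rebuilt = stream_to_table(source.changelog)
```

Here `table` holds one row per user with the most recent login, and replaying `source.changelog` yields a table identical to `source.rows`.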

KTable in Kafka Streams

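A KTable is the table view of a topic: each record is an insert or update for its key, and a record with a null value deletes the key. Its state lives in a local state store on the application instance, RocksDB by default. Interactive queries against that store return the latest value per key. Joining a KStream with a KTable produces another KStream (enrichment); aggregating a KStream, or joining two KTables, produces a KTable.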
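The enrichment case can be modeled without Kafka at all. The sketch below is plain Python, not the Kafka Streams API: `stream_table_join` and the tagged-record layout are assumptions made for illustration. It mirrors the usual semantics of a stream-table inner join: table-side records only update state, and each stream-side record is looked up against the table as it stands at that moment.

```python
from typing import Dict, Iterable, Iterator, Tuple

# ("table", key, value) updates the table side;
# ("stream", key, value) is an event to enrich.
Record = Tuple[str, str, str]


def stream_table_join(records: Iterable[Record]) -> Iterator[Tuple[str, str, str]]:
    """Inner join: emit (key, event, table_value) for each stream event with a match."""
    table: Dict[str, str] = {}
    for side, key, value in records:
        if side == "table":
            table[key] = value  # a table update changes state but emits nothing
        elif key in table:
            yield (key, value, table[key])  # enrich with the current table value


records = [
    ("table", "u1", "DE"),
    ("stream", "u1", "click"),
    ("stream", "u2", "click"),     # no table row for u2: dropped by the inner join
    ("table", "u1", "FR"),
    ("stream", "u1", "purchase"),  # sees the updated value
]
enriched = list(stream_table_join(records))
```

The first `u1` event is enriched with `"DE"` and the second with `"FR"`, which shows that the result depends on the order in which table updates and stream events are processed.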

Flink SQL — same idea, different surface

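A statement like CREATE TABLE flink_orders ... WITH ('connector'='kafka', ...) declares a Kafka topic as a table. SELECT, JOIN, and GROUP BY are written as on any SQL table, but the query runs continuously and emits a stream of updates to its result instead of a one-time result set. This continuous query model is built on stream-table duality.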
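To make "emits a stream of updates" concrete, here is a toy model of a continuous `COUNT(*) ... GROUP BY key` query in plain Python. It is not Flink's implementation; `continuous_count` is a hypothetical name, and the `+I`, `-U`, and `+U` labels are borrowed from Flink's changelog notation for insert, update-before, and update-after rows.

```python
from typing import Dict, Iterable, Iterator, Tuple

Change = Tuple[str, str, int]  # (row_kind, key, count)


def continuous_count(keys: Iterable[str]) -> Iterator[Change]:
    """Emit a changelog for a running COUNT(*) grouped by key."""
    counts: Dict[str, int] = {}
    for key in keys:
        old = counts.get(key)
        new = (old or 0) + 1
        counts[key] = new
        if old is None:
            yield ("+I", key, new)  # first row for this key: insert
        else:
            yield ("-U", key, old)  # retract the previous result row
            yield ("+U", key, new)  # emit the updated result row


updates = list(continuous_count(["c1", "c2", "c1"]))
```

The second `c1` event does not append a new result row; it retracts the old count and replaces it, which is what lets a downstream consumer keep an accurate table from the update stream.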

Materialized views

Aggregate a stream into a smaller table, and that table is a materialized view of the stream. Each change to the view can be streamed back out as an update. This underpins real-time dashboards, leaderboards, and current-state lookups.
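A leaderboard is a small enough case to sketch end to end. The class below is a hypothetical in-memory stand-in for a materialized view, assuming score events of the form (player, points): `apply` folds one event into the view and returns the changed row, and `top` answers a current-state lookup.

```python
import heapq
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class Leaderboard:
    """Materialized view: running total of points per player."""

    def __init__(self) -> None:
        self.totals: DefaultDict[str, int] = defaultdict(int)

    def apply(self, player: str, points: int) -> Tuple[str, int]:
        """Fold one score event into the view and return the updated row."""
        self.totals[player] += points
        return (player, self.totals[player])

    def top(self, n: int) -> List[Tuple[str, int]]:
        """Current-state lookup: the n highest totals right now."""
        return heapq.nlargest(n, self.totals.items(), key=lambda kv: kv[1])


board = Leaderboard()
score_events = [("ann", 10), ("bo", 7), ("ann", 5), ("cy", 12)]

# The returned rows form the update stream a dashboard could subscribe to.
changes = [board.apply(player, points) for player, points in score_events]
top_two = board.top(2)
```

The view holds three rows however many score events arrive, and `changes` is the stream of row updates that keeps any downstream copy in sync.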

Where the duality leaks

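Late-arriving events mean the table was 'wrong' for a moment, until the late event arrived and corrected it. Out-of-order processing has the same effect. Frameworks handle both with watermarks and triggers, but the duality is no longer trivially clean: the table you see depends on when you look and on how long the framework waits for stragglers. Read about event time vs. processing time before scaling.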
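The effect of a watermark can be seen in a simplified model. The sketch below counts events in tumbling event-time windows and assumes a bounded-out-of-orderness watermark (the largest event time seen minus `max_delay`). The function name, parameters, and the `"fire"` and `"late"` labels are all hypothetical, and real frameworks offer more options, such as allowed lateness and re-firing a window with corrected results.

```python
from typing import Dict, Iterable, List, Tuple

Output = Tuple[str, int, int]


def windowed_counts(event_times: Iterable[int], size: int = 10, max_delay: int = 5) -> List[Output]:
    """Count events per tumbling window; a window fires once the watermark passes its end."""
    open_windows: Dict[int, int] = {}  # window start -> count so far
    watermark = float("-inf")
    out: List[Output] = []
    for t in event_times:
        start = (t // size) * size
        if start + size <= watermark:
            # The window already fired: this event is late and is not counted.
            out.append(("late", start, t))
            continue
        open_windows[start] = open_windows.get(start, 0) + 1
        # The watermark trails the largest event time seen by max_delay.
        watermark = max(watermark, t - max_delay)
        for s in sorted(w for w in open_windows if w + size <= watermark):
            out.append(("fire", s, open_windows.pop(s)))
    return out


# Event 3 is out of order but within max_delay, so it is still counted.
# Event 7 arrives after window [0, 10) has fired, so it is reported as late.
results = windowed_counts([1, 4, 12, 3, 16, 7])
```

In this run the window starting at 0 fires with a count of 3 and the event at time 7 is flagged as late, so the emitted table row undercounts what actually happened. A larger `max_delay` would catch it at the cost of slower results, which is the trade-off watermarks expose. The window starting at 10 never fires because the input ends before the watermark reaches 20.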

Streams build tables; tables snapshot streams. Get this and Kafka Streams + Flink SQL stop feeling like different products.