A method of ingesting, processing, and analyzing data continuously as it flows through a system.
In modern architectures, vast volumes of events often need to be handled as they arrive rather than collected into batches. Stream processing lets applications react to events in near real time, driving dashboards, alerts, and automated responses.
Ideal when you need results within milliseconds to seconds of data arrival, such as real-time fraud detection or live telemetry monitoring.
You need to know
Windowing - Groups a continuous flow into time-based “buckets” (e.g., every 5 seconds) so you can compute totals or averages.
Stateful operations - Maintain running counts or aggregations (like user session lengths) across events and windows. Engines typically checkpoint this state so it survives when parts of the system fail.
Event vs. processing time - You can group events by the timestamp each event carries (event time) or by the clock time at which the system sees it (processing time). The choice matters for handling late or out-of-order events.
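To make the three ideas above concrete, here is a minimal sketch in plain Python, not the API of any real engine. It counts events per user in 5-second tumbling windows keyed by event time, keeps the running counts as in-memory state, and uses a simple watermark to decide when a window is finished. The class name `TumblingWindowCounter`, the `allowed_lateness` parameter, and the sample events are all assumptions made up for illustration. A real engine would also checkpoint the state, which this sketch skips.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TumblingWindowCounter:
    """Hypothetical helper: counts events per key in fixed event-time windows."""

    window_size: int = 5        # seconds per window (illustrative value)
    allowed_lateness: int = 2   # how far behind the newest event we still wait
    watermark: int = 0          # newest event time seen, minus allowed lateness
    # State: window start -> key -> count. Only held in memory in this sketch.
    state: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(int))
    )
    dropped: list = field(default_factory=list)

    def window_start(self, event_time: int) -> int:
        """Map an event timestamp to the start of its tumbling window."""
        return event_time - (event_time % self.window_size)

    def process(self, key: str, event_time: int) -> list:
        """Ingest one event and return any windows that just closed."""
        start = self.window_start(event_time)
        if start + self.window_size <= self.watermark:
            # The window was already emitted, so this event is too late.
            self.dropped.append((key, event_time))
            return []
        self.state[start][key] += 1
        # Advance the watermark; it never moves backwards.
        self.watermark = max(self.watermark, event_time - self.allowed_lateness)
        closed = sorted(
            s for s in self.state if s + self.window_size <= self.watermark
        )
        return [(s, dict(self.state.pop(s))) for s in closed]


# Events as (user, event_time in seconds), listed in arrival order.
events = [
    ("alice", 1), ("bob", 3), ("alice", 6),
    ("alice", 4),   # out of order, but its window is still open
    ("bob", 8),
    ("bob", 2),     # arrives after its window closed
    ("alice", 12),
]

counter = TumblingWindowCounter()
emitted = []
for user, ts in events:
    emitted.extend(counter.process(user, ts))

print(emitted)          # [(0, {'alice': 2, 'bob': 1}), (5, {'alice': 1, 'bob': 1})]
print(counter.dropped)  # [('bob', 2)]
```

Because windows are assigned by event time, the out-of-order `("alice", 4)` event still lands in the 0 to 5 second window. Under processing time it would simply be counted in whichever window was open when it arrived. The `allowed_lateness` value is the trade-off knob: a larger value catches more stragglers but delays results.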
Popular technologies
Kafka Streams - A Java library that builds on Kafka topics for lightweight, embedded stream apps.
Apache Flink - A distributed engine offering robust windowing and exactly-once guarantees.
Amazon Kinesis Data Analytics - A managed AWS service that lets you run SQL or Flink jobs on live Kinesis streams. The Flink offering was renamed Amazon Managed Service for Apache Flink in 2023.