All About Programming: Stateful Streaming in Spark and Kafka Streams

TL;DR: This article covers aggregates in stateful stream processing. It walks through two concrete examples: one in Apache Spark, using the Streaming API with mapWithState, and one in Apache Kafka, using the high-level DSL of Kafka Streams. Spark Streaming and Kafka Streams differ considerably, so seeing both approaches side by side should help the reader decide which fits best. Besides reliability and scaling requirements, I discuss which features we would like real-time aggregates to provide: being queryable, re-processable, versionable and composable, supporting unlimited updates, retention times and downsampling, and offering end-to-end guarantees. Finally, I discuss the example code against those requirements.
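As a rough illustration of what a stateful aggregate means here, the following plain-Python sketch keeps a running sum per key across successive batches of events. It is only a conceptual model and deliberately uses no Spark or Kafka Streams API; the event shape and the names `update_state` and `run` are hypothetical and are not taken from the article's example code.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (key, value). Not taken from the article.
Event = Tuple[str, int]


def update_state(state: Dict[str, int], batch: Iterable[Event]) -> Dict[str, int]:
    """Fold one batch into the running per-key sums.

    Mutates `state` in place and returns only the keys touched by this
    batch together with their new totals, roughly what a stateful
    operator would emit downstream.
    """
    updated: Dict[str, int] = {}
    for key, value in batch:
        state[key] = state.get(key, 0) + value
        updated[key] = state[key]
    return updated


def run(batches: Iterable[List[Event]]) -> Tuple[Dict[str, int], List[Dict[str, int]]]:
    """Process batches in order, carrying the state from one batch to the next."""
    state: Dict[str, int] = {}
    emitted: List[Dict[str, int]] = []
    for batch in batches:
        emitted.append(update_state(state, batch))
    return state, emitted


if __name__ == "__main__":
    final_state, emitted = run([[("a", 1), ("b", 2)], [("a", 3)]])
    print(final_state)
    print(emitted)
```

The point of the sketch is that the state outlives each individual batch. That is what makes requirements such as queryability, retention times and re-processing relevant in the first place: a real system has to store, expire and rebuild this state reliably.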
