Stateful Streaming in Spark and Kafka Streams
TLDR: This article is about aggregates in stateful stream processing. It covers two concrete examples: one in Apache Spark (using the Streaming API with mapWithState) and one in Apache Kafka (using the high-level DSL in Kafka Streams). Spark Streaming and Kafka Streams differ considerably, so the reader can get to know both approaches and decide which fits best. I discuss which features we would like to see covered for real-time aggregates beyond reliability and scaling requirements (e.g. being queryable, re-processable, versionable and composable, supporting unlimited updates, retention times and downsampling, and having end-to-end guarantees). At the end, I discuss the example code against those requirements.
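To make "stateful aggregate" concrete, here is a minimal, library-free sketch in plain Python. It is not the article's example code and uses neither Spark nor Kafka Streams; the names `update_running_sum` and `process_batches` are hypothetical. It only illustrates the general shape of a per-key update function, roughly in the spirit of what mapWithState expects: take a key, a new value and the stored state, update the state, and emit the new aggregate.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]


def update_running_sum(key: str, value: int, state: Dict[str, int]) -> Record:
    """Fold one record into the per-key state and emit the updated aggregate."""
    state[key] = state.get(key, 0) + value
    return key, state[key]


def process_batches(
    batches: Iterable[List[Record]],
) -> Tuple[List[Record], Dict[str, int]]:
    """Process micro-batches in order, keeping state alive across batches."""
    state: Dict[str, int] = {}  # assumption: in-memory state, no fault tolerance
    outputs: List[Record] = []
    for batch in batches:
        for key, value in batch:
            outputs.append(update_running_sum(key, value, state))
    return outputs, state


# Two micro-batches; key "a" appears in both, so its sum carries over.
batches = [[("a", 1), ("b", 2)], [("a", 3)]]
outputs, final_state = process_batches(batches)
```

The state here is an ordinary dictionary, so the sketch ignores the requirements listed above: it is not fault tolerant or queryable from outside, and it has no retention time.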