Spark & Kafka - Achieving zero data-loss
Spark Streaming can connect to Kafka using two approaches described in the Kafka Integration Guide. The first approach, which uses a receiver, is less than ideal in terms of parallelism: it forces you to create multiple DStreams to increase throughput. As a matter of fact, most people tend to deprecate it in favor of the Direct Stream approach that appeared in Spark 1.3 (see the post on the Databricks blog and a post by the main contributor).
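As a rough illustration, here is a minimal sketch of the direct approach using the Spark 1.3-era createDirectStream API (Kafka 0.8 integration). The broker list, topic name, batch interval, and application name are placeholders, not values from the article.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("direct-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(5)) // batch interval is an arbitrary choice

    // Placeholder broker list and topic; adjust for your cluster.
    val kafkaParams = Map[String, String]("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("events")

    // Direct stream: no receiver, one RDD partition per Kafka partition,
    // offsets tracked by Spark itself rather than by a receiver via ZooKeeper.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // The stream yields (key, value) pairs; here we just count values per batch.
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because each Kafka partition maps directly to an RDD partition, parallelism scales with the topic's partition count instead of with the number of receiver DStreams you create by hand.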