performance - Apache Spark: map vs mapPartitions? - Stack Overflow
map converts each element of the source RDD into a single element of the result RDD by applying a function.
mapPartitions converts each partition of the source RDD into multiple elements of the result (possibly none); its function is applied once per partition rather than once per element.
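The difference can be sketched without a cluster by modeling an RDD as a list of partitions. This is a plain-Python illustration of the semantics, not the Spark API: `map_elements`, `map_partitions`, `partition_sum` and `keep_nothing` are hypothetical names, and the list-of-lists representation is an assumption made for the example.

```python
from typing import Any, Callable, Iterable, Iterator, List

# Assumption: an RDD is modeled as a list of partitions, each a list of elements.
Partitions = List[List[Any]]


def map_elements(parts: Partitions, f: Callable[[Any], Any]) -> Partitions:
    """Model of map: f is called once per element and returns exactly one result."""
    return [[f(x) for x in part] for part in parts]


def map_partitions(
    parts: Partitions, f: Callable[[Iterator[Any]], Iterable[Any]]
) -> Partitions:
    """Model of mapPartitions: f is called once per partition with an iterator
    over its elements and may produce any number of results, including none."""
    return [list(f(iter(part))) for part in parts]


def partition_sum(elements: Iterator[int]) -> Iterator[int]:
    # One output element per partition, however many inputs it holds.
    yield sum(elements)


def keep_nothing(elements: Iterator[int]) -> Iterator[int]:
    # Zero output elements: the "possibly none" case.
    return iter(())


rdd = [[1, 2, 3], [4, 5]]                    # two partitions
doubled = map_elements(rdd, lambda x: x * 2)  # same element count as the input
sums = map_partitions(rdd, partition_sum)     # one element per partition
empty = map_partitions(rdd, keep_nothing)     # no elements at all
```

In this model `doubled` always has as many elements as `rdd`, while the size of the `map_partitions` results depends entirely on what the function chooses to emit for each partition.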
And does flatMap behave like map or like mapPartitions?
Neither: it works on a single element (as map does) and produces multiple elements of the result (as mapPartitions does).
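A minimal sketch of that behavior, again in plain Python rather than Spark itself. It assumes an RDD can be modeled as a list of partitions, and `flat_map` is a hypothetical helper name standing in for flatMap.

```python
from typing import Any, Callable, Iterable, List

# Assumption: an RDD is modeled as a list of partitions, each a list of elements.
Partitions = List[List[Any]]


def flat_map(parts: Partitions, f: Callable[[Any], Iterable[Any]]) -> Partitions:
    """Model of flatMap: f is called once per element (like map), but returns
    zero or more results, which are flattened into the output."""
    return [[y for x in part for y in f(x)] for part in parts]


lines = [["a b", ""], ["c d e"]]                     # two partitions of text lines
words = flat_map(lines, lambda line: line.split())   # each line yields 0..n words
```

The function passed in still sees one element at a time, so it cannot share work across a partition the way a mapPartitions function can; only the output cardinality resembles mapPartitions.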