java - Apache spark in memory caching - Stack Overflow
To uncache an RDD explicitly, call RDD.unpersist().
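A minimal sketch of caching and explicit uncaching via the Java API (the class name and sample data are illustrative, not from the original answer):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import java.util.Arrays;

public class UnpersistExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("UnpersistExample")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));

        // cache() only marks the RDD for in-memory storage;
        // it is actually materialized on the first action.
        numbers.cache();
        long count = numbers.count();

        // unpersist() explicitly drops the cached blocks from memory.
        numbers.unpersist();

        sc.stop();
    }
}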
If you want to share cached RDDs across multiple jobs, you can try the following:
- Cache the RDD in a single SparkContext and re-use that context for the other jobs. This way you cache the data once and read it many times (see the sketch after this list).
- There are 'Spark job servers' that provide exactly this functionality. Check out the Spark Job Server open-sourced by Ooyala.
- Use an external caching solution such as Tachyon.
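A minimal sketch of the first option: two jobs run against the same JavaSparkContext, so the second reuses the cached blocks instead of recomputing them. The input path hdfs:///data/events.log and the class name are hypothetical placeholders:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SharedContextExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("SharedContextExample")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Cache once within this context.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/events.log").cache();

        // Job 1: this action materializes the RDD and fills the cache.
        long total = lines.count();

        // Job 2: runs in the same context, so it reads the cached blocks
        // instead of re-reading the file from HDFS.
        long errors = lines.filter(line -> line.contains("ERROR")).count();

        sc.stop();
    }
}

Note that the cache only lives as long as the driver's context does; once it exits, the cached data is gone. Working around that lifetime limit is precisely what the job-server and Tachyon options above address.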
Read full article from java - Apache spark in memory caching - Stack Overflow