Installing the Cassandra / Spark OSS Stack
When assembling an analytics stack, there are usually myriad choices to make. For this build, I decided to put together the smallest stack possible that lets me run Spark queries on Cassandra data. As configured, it is not highly available, since the Spark master runs standalone. (Note: the Spark master in DataStax Enterprise gets its HA from Cassandra.) It's a decent tradeoff for portacluster, since I can run the master on the admin node, which doesn't get rebooted or reimaged constantly. I'm also going to skip HDFS, or any kind of HDFS replacement, for now. Options I plan to look at later are GlusterFS's HDFS adapter and Pithos as an S3 adapter. In the end, the stack is simply Cassandra and Spark with the spark-cassandra-connector.
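To make the shape of that stack concrete, here is a minimal sketch of how a job might be pointed at a standalone Spark master and a Cassandra node. The helper name `spark_submit_args`, the host names, and the connector coordinates in the usage example are all assumptions for illustration, not values from this build. The flags `--master`, `--packages`, and `--conf`, and the `spark.cassandra.connection.host` property, are standard spark-submit and spark-cassandra-connector settings.

```python
def spark_submit_args(master_host, cassandra_host, connector_package, app,
                      master_port=7077):
    """Build a spark-submit command line for a standalone master plus Cassandra.

    All arguments are hypothetical placeholders supplied by the caller.
    7077 is the default port of a standalone Spark master.
    """
    return [
        "spark-submit",
        # Standalone master URL, e.g. the admin node in this kind of setup.
        "--master", f"spark://{master_host}:{master_port}",
        # Maven coordinates of the spark-cassandra-connector to pull in.
        "--packages", connector_package,
        # Tells the connector which Cassandra node to contact first.
        "--conf", f"spark.cassandra.connection.host={cassandra_host}",
        app,
    ]


if __name__ == "__main__":
    # Host names, connector version, and job file are assumed for illustration.
    args = spark_submit_args(
        master_host="admin-node",
        cassandra_host="cassandra-node-1",
        connector_package="com.datastax.spark:spark-cassandra-connector_2.12:3.4.1",
        app="job.py",
    )
    print(" ".join(args))
```

Because the master is a single standalone process, the `spark://` URL names exactly one host. That is the availability tradeoff described above: if that host goes down, jobs cannot be submitted until it returns.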