Spark Shell Examples – Altiscale Docs
Copy Test Data to HDFS

The following will upload all of our example data to HDFS under your current login username. These include the datasets for GraphX PageRank and for the MLlib decision tree, logistic regression, KMeans, linear regression, SVM, and naive Bayes examples.

    pushd `pwd`
    cd /opt/spark/

Second, launch the spark-shell command again with the following command:

    SPARK_SUBMIT_OPTS="-XX:MaxPermSize=256m" ./bin/spark-shell --master yarn --queue research --driver-class-path $(find /opt/hadoop/share/hadoop/mapreduce/lib/hadoop-lzo-* | head -n 1)

Run the following Scala statements in the Scala REPL shell:

SVM
Logistic Regression
Naive Bayes
KMeans
GraphX PageRank

Decision Tree - Classification and Regression/Prediction

    // CLASSIFICATION
    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.tree.configuration.Algo._
    import org.apache.spark.mllib.tree. [...]

Read full article from Spark Shell Examples – Altiscale Docs
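The launch command above passes whatever `find ... | head -n 1` returns straight to `--driver-class-path`, so if no hadoop-lzo jar is installed the option is left without a value. The following is a minimal sketch of resolving the jar first and failing with a clear message otherwise. The names `LZO_DIR`, `first_lzo_jar`, and `print_launch_command` are hypothetical helpers introduced here for illustration, not part of Spark or the original article. The sketch only prints the command; it does not start spark-shell.

```shell
#!/usr/bin/env bash
# Sketch only: LZO_DIR, first_lzo_jar and print_launch_command are
# hypothetical names, not part of Spark or Hadoop.

# Directory searched in the launch command from the text; override if needed.
LZO_DIR="${LZO_DIR:-/opt/hadoop/share/hadoop/mapreduce/lib}"

# Print the first hadoop-lzo-* file (by name order) in the given directory.
# Returns non-zero when the directory is missing or holds no such file.
first_lzo_jar() {
  local dir="$1" jar
  jar=$(find "$dir" -maxdepth 1 -name 'hadoop-lzo-*' 2>/dev/null | sort | head -n 1)
  [ -n "$jar" ] || return 1
  printf '%s\n' "$jar"
}

# Print the spark-shell launch command with the resolved jar filled in.
# Nothing is executed; on failure an error goes to stderr.
print_launch_command() {
  local jar
  if jar=$(first_lzo_jar "$1"); then
    echo "SPARK_SUBMIT_OPTS=\"-XX:MaxPermSize=256m\" ./bin/spark-shell --master yarn --queue research --driver-class-path $jar"
  else
    echo "no hadoop-lzo jar found under $1" >&2
    return 1
  fi
}
```

Calling `print_launch_command "$LZO_DIR"` from `/opt/spark/` prints the full command for review before it is run. Sorting before `head -n 1` keeps the choice deterministic when several versions of the jar are present.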