Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics



Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics

Apache Spark: Distributed Machine Learning using MLbase Implementing and consuming Machine Learning techniques at scale are difficult tasks for ML Developers and End Users. MLbase (www.mlbase.org) is an open-source platform under active development addressing the issues of both groups. MLbase consists of three components—MLlib, MLI and ML Optimizer. MLlib is a low-level distributed ML library written against the Spark, MLI is an API / platform for feature extraction and algorithm development that introduces high-level ML programming abstractions, and ML Optimizer is a layer aiming to simplify ML problems for End Users by automating the tasks of feature and model selection. In this talk we will describe the high-level functionality of each of these layers, and demonstrate its scalability and ease-of-use via real-world examples involving classification, regression, clustering and collaborative filtering. In the course of applying machine-learning against large data sets,

Read full article from Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts