Machine Learning Tutorial: The Naive Bayes Text Classifier | Datumbox



Machine Learning Tutorial: The Naive Bayes Text Classifier | Datumbox

Update: The Datumbox Machine Learning Framework is now open-source and free to download . Check out the package com.datumbox.framework.machinelearning.classification to see the implementation of Naive Bayes Classifier in Java. Note that some of the techniques described below are used on Datumbox's Text Analysis service and they power up our API . What is the Naive Bayes Classifier? The Naive Bayes classifier is a simple probabilistic classifier which is based on Bayes theorem with strong and naïve independence assumptions. It is one of the most basic text classification techniques with various applications in email spam detection, personal email sorting, document categorization, sexually explicit content detection, language detection and sentiment detection. Despite the naïve design and oversimplified assumptions that this technique uses, Naive Bayes performs well in many complex real-world problems. Even though it is often outperformed by other techniques such as boosted trees,

Read full article from Machine Learning Tutorial: The Naive Bayes Text Classifier | Datumbox


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts