Logistic Regression using Mahout | sjsubigdata
Read full article from Logistic Regression using Mahout | sjsubigdata
- Logistic regression is a model used to predict the probability that an event occurs. It makes use of several predictor variables, which may be either numerical or categorical.
- It is typically used to predict a binary response, that is, the outcome of a categorical dependent variable (i.e., a class label), based on one or more predictor variables (features).
- Logistic regression is the standard industry workhorse that underlies many production fraud detection and advertising quality and targeting products.
- Mahout’s implementation of logistic regression uses the Stochastic Gradient Descent (SGD) algorithm.
- This algorithm is sequential (non-parallel), but it is fast. Because it uses SGD instead of iteratively reweighted least squares (IRLS), the model can be updated incrementally, which can be important for some uses. A parallel variant, Parallel Stochastic Gradient Descent, exists, but Mahout does not use it.
- While working with large data, the SGD algorithm uses a constant amount of memory regardless of the size of the input.
- Mahout includes a command-line example program for logistic regression.
- In production, logistic regression is mostly not run from the command line. Instead, it is integrated more tightly into a data flow: the model is used as part of an existing process, and code is written to call the libraries directly.
- Mahout’s implementation of logistic regression using SGD supports the following command-line program names:
o cat : Print a file or resource as the logistic regression models would see it
o runlogistic : Run a logistic regression model against CSV data
o trainlogistic : Train a logistic regression using stochastic gradient descent
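
To make the points about incremental updating and constant memory more concrete, here is a minimal sketch of logistic regression trained with SGD in plain Python. It is not Mahout's code and does not use Mahout's API; the class `OnlineLogistic`, the helper `stream_examples`, the learning rate, and the synthetic data are all hypothetical names and values chosen for illustration. The model keeps only one weight per feature plus a bias, and each training example is seen once and then discarded, so memory use does not grow with the size of the input.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogistic:
    """Binary logistic regression updated one example at a time (illustrative only)."""

    def __init__(self, num_features, learning_rate=0.1):
        # The whole model state: one weight per feature plus a bias term.
        self.weights = [0.0] * num_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        # Estimated probability that the label of x is 1.
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return sigmoid(z)

    def train(self, x, y):
        # One SGD step on the log-loss for a single example (y is 0 or 1).
        error = y - self.predict_proba(x)
        for i, xi in enumerate(x):
            self.weights[i] += self.learning_rate * error * xi
        self.bias += self.learning_rate * error


def stream_examples(n, seed=0):
    # Hypothetical data source: yields examples one by one instead of
    # loading them all into memory. The label is 1 when x0 + x1 > 0.
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] + x[1] > 0 else 0
        yield x, y


model = OnlineLogistic(num_features=2)
for x, y in stream_examples(5000):
    model.train(x, y)  # incremental update; the example is then dropped

print(model.predict_proba([0.8, 0.8]))
print(model.predict_proba([-0.8, -0.8]))
```

Because `train` takes a single example, the same model object could keep learning as new records arrive, which is the kind of incremental updating that a batch method such as IRLS does not offer directly.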