Apache Mahout is a machine learning library for big data based on Java. It offers a simple and extensible programming environment for creating parallel and scalable machine learning applications. This analytics library already consists of a variety of algorithms for different execution environments such as Scala and Apache Spark, H20, as well as Apache Flink. More recently, Samsara was added to Mahout. Samsara is a vector math experimentation environment and its syntax is similar to statistical programming with R.
Mahout Algorithm Collection
The distribution consists already of a collection of useful algorithms in versions that enable in-core or distributed computing. Examples include stochastic singular value decomposition, stochastic principal component analysis, distributed regularized alternating least squares, collaborative filtering (item and row similarity), naive bayes classification. There is currently no implementation of a parallel and scalable support vector machine.
More Information about Mahout
A nice short overview is the following video:
Follow us on Facebook: