BOOK DESCRIPTION
As more organizations adopt big data analytics, interest in platforms
that provide storage, computation, and analytic capabilities has
grown. Apache Mahout caters to this need and makes it possible to
implement complex machine learning algorithms to better analyse your
data and gain useful insights from it. Starting with an introduction
to clustering algorithms, this book gives an insight into Apache
Mahout and the different algorithms it uses for clustering data. It
gives a general introduction to algorithms such as K-Means, Fuzzy
K-Means, and StreamingKMeans, and shows how to use Mahout to cluster
your data with a particular algorithm. You will study the different
types of clustering and learn how to use Apache Mahout with real-world
data sets to implement and evaluate your clusters. The book also
discusses cluster improvement and visualization using Mahout APIs, and
explores model-based clustering and topic modelling using the
Dirichlet process. Finally, you will learn how to build and deploy a
model for production use.
WHAT YOU WILL LEARN
* Explore clustering algorithms and cluster evaluation techniques
* Learn the different types of clustering and distance measurement
techniques
* Perform clustering on your data using KMeans clustering
* Discover how canopy clustering is used as a preprocessing step for
KMeans
* Use the Fuzzy KMeans algorithm in Apache Mahout
* Implement Streaming KMeans clustering in Mahout
* Learn Mahout's implementation of Spectral KMeans clustering
WHO THIS BOOK IS FOR
Product details
ISBN
9781783284443
Published
2015
Edition
1st edition
Publisher
Packt Publishing
Language
English
Format
Digital book
Author