public class GaussianMixture
extends Object
implements scala.Serializable
Given a set of sample points, this class will maximize the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol, or until it has reached the max number of iterations. While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
param: k Number of independent Gaussians in the mixture model.

param: convergenceTol Maximum change in log-likelihood at which convergence is considered to have occurred.

param: maxIterations Maximum number of iterations allowed.
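The stopping rule described above (iterate until the log-likelihood changes by less than convergenceTol, or until maxIterations is reached) can be illustrated with a small, self-contained sketch. This is not the library's implementation: it is a hypothetical one-dimensional, two-component EM loop in plain Java, and the class name `EmStoppingSketch`, its helper methods, the toy data, and the variance floor are all assumptions made for illustration.

```java
import java.util.Random;

/** Hypothetical 1-D, two-component EM loop; it only illustrates the stopping rule. */
final class EmStoppingSketch {

    /** Final log-likelihood and the number of iterations actually run. */
    static final class Result {
        final double logLikelihood;
        final int iterations;

        Result(double logLikelihood, int iterations) {
            this.logLikelihood = logLikelihood;
            this.iterations = iterations;
        }
    }

    /** Univariate normal density. */
    static double density(double x, double mean, double variance) {
        double diff = x - mean;
        return Math.exp(-diff * diff / (2.0 * variance)) / Math.sqrt(2.0 * Math.PI * variance);
    }

    /** Deterministic toy data (an assumption): 100 points around 0 and 100 around 5. */
    static double[] sampleData(long seed) {
        Random random = new Random(seed);
        double[] data = new double[200];
        for (int i = 0; i < data.length; i++) {
            data[i] = random.nextGaussian() + (i < 100 ? 0.0 : 5.0);
        }
        return data;
    }

    static Result fit(double[] data, double convergenceTol, int maxIterations) {
        final int k = 2;
        double[] weight = {0.5, 0.5};
        double[] mean = {data[0], data[data.length - 1]}; // crude initialization
        double[] variance = {1.0, 1.0};

        double current = -Double.MAX_VALUE; // log-likelihood from the latest E-step
        double previous = 0.0;              // log-likelihood from the E-step before that
        int iterations = 0;

        // Stop when the change is within convergenceTol or the iteration cap is hit.
        while (iterations < maxIterations && Math.abs(current - previous) > convergenceTol) {
            double[] sumR = new double[k];
            double[] sumRx = new double[k];
            double[] sumRxx = new double[k];
            double logLikelihood = 0.0;

            // E-step: responsibilities and log-likelihood under the current parameters.
            for (double x : data) {
                double[] p = new double[k];
                double total = 0.0;
                for (int j = 0; j < k; j++) {
                    p[j] = weight[j] * density(x, mean[j], variance[j]);
                    total += p[j];
                }
                logLikelihood += Math.log(total);
                for (int j = 0; j < k; j++) {
                    double r = p[j] / total;
                    sumR[j] += r;
                    sumRx[j] += r * x;
                    sumRxx[j] += r * x * x;
                }
            }

            // M-step: re-estimate weights, means, and variances (variance floor is an assumption).
            for (int j = 0; j < k; j++) {
                weight[j] = sumR[j] / data.length;
                mean[j] = sumRx[j] / sumR[j];
                variance[j] = Math.max(sumRxx[j] / sumR[j] - mean[j] * mean[j], 1e-6);
            }

            previous = current;
            current = logLikelihood;
            iterations++;
        }
        return new Result(current, iterations);
    }

    public static void main(String[] args) {
        Result result = fit(sampleData(42L), 0.01, 100);
        System.out.println("iterations = " + result.iterations);
        System.out.println("logLikelihood = " + result.logLikelihood);
    }
}
```

The loop can stop at a local optimum, which is why the result depends on the starting point; in this sketch that is the crude choice of initial means.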
| Constructor and Description |
|---|
| `GaussianMixture()` Constructs a default instance. |
| Modifier and Type | Method and Description |
|---|---|
| double | `getConvergenceTol()` Return the largest change in log-likelihood at which convergence is considered to have occurred. |
| scala.Option<GaussianMixtureModel> | `getInitialModel()` Return the user-supplied initial GMM, if one was supplied. |
| int | `getK()` Return the number of Gaussians in the mixture model. |
| int | `getMaxIterations()` Return the maximum number of iterations allowed. |
| long | `getSeed()` Return the random seed. |
| GaussianMixtureModel | `run(JavaRDD<Vector> data)` Java-friendly version of `run()`. |
| GaussianMixtureModel | `run(RDD<Vector> data)` Perform expectation maximization. |
| GaussianMixture | `setConvergenceTol(double convergenceTol)` Set the largest change in log-likelihood at which convergence is considered to have occurred. |
| GaussianMixture | `setInitialModel(GaussianMixtureModel model)` Set the initial GMM starting point, bypassing the random initialization. |
| GaussianMixture | `setK(int k)` Set the number of Gaussians in the mixture model. |
| GaussianMixture | `setMaxIterations(int maxIterations)` Set the maximum number of iterations allowed. |
| GaussianMixture | `setSeed(long seed)` Set the random seed. |
| static boolean | `shouldDistributeGaussians(int k, int d)` Heuristic to distribute the computation of the MultivariateGaussians, approximately when d is greater than 25, except when k is very small. |
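public GaussianMixture()
Constructs a default instance.

public static boolean shouldDistributeGaussians(int k, int d)
Heuristic to distribute the computation of the MultivariateGaussians, approximately when d is greater than 25, except when k is very small.
k - Number of Gaussians in the mixture model
d - Number of features

public GaussianMixture setInitialModel(GaussianMixtureModel model)
Set the initial GMM starting point, bypassing the random initialization.
model - (undocumented)

public scala.Option<GaussianMixtureModel> getInitialModel()
Return the user-supplied initial GMM, if one was supplied.

public GaussianMixture setK(int k)
Set the number of Gaussians in the mixture model.
k - (undocumented)

public int getK()
Return the number of Gaussians in the mixture model.

public GaussianMixture setMaxIterations(int maxIterations)
Set the maximum number of iterations allowed.
maxIterations - (undocumented)

public int getMaxIterations()
Return the maximum number of iterations allowed.

public GaussianMixture setConvergenceTol(double convergenceTol)
Set the largest change in log-likelihood at which convergence is considered to have occurred.
convergenceTol - (undocumented)

public double getConvergenceTol()
Return the largest change in log-likelihood at which convergence is considered to have occurred.

public GaussianMixture setSeed(long seed)
Set the random seed.
seed - (undocumented)

public long getSeed()
Return the random seed.

public GaussianMixtureModel run(RDD<Vector> data)
Perform expectation maximization.
data - (undocumented)

public GaussianMixtureModel run(JavaRDD<Vector> data)
Java-friendly version of run().
data - (undocumented)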
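
The description of shouldDistributeGaussians ("approximately when d is greater than 25 except for when k is very small") can be made concrete with a short sketch. The formula below, `((k - 1) / k) * d > 25`, is an assumption chosen to match that description, and the class `DistributeHeuristicSketch` is hypothetical; this page does not state the exact expression the library uses.

```java
/** Hypothetical stand-in for the heuristic; the formula is an assumption. */
final class DistributeHeuristicSketch {

    // Threshold taken from the description ("d is greater than 25").
    static final double THRESHOLD = 25.0;

    static boolean shouldDistributeGaussians(int k, int d) {
        // (k - 1) / k is 0 for k = 1 and approaches 1 as k grows, so the test
        // reduces to roughly "d > 25" unless k is very small.
        return ((k - 1.0) / k) * d > THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(shouldDistributeGaussians(1, 1000)); // single Gaussian
        System.out.println(shouldDistributeGaussians(10, 30));  // many Gaussians, d above 25
    }
}
```

Under this assumed formula, a single Gaussian is never distributed regardless of d, while for larger k the decision is driven almost entirely by the number of features.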