Clustering
The coclust.clustering module provides clustering algorithms.
class coclust.clustering.SphericalKmeans(n_clusters=2, init=None, max_iter=20, n_init=1, tol=1e-09, random_state=None, weighting=True)
Spherical k-means clustering.
Parameters:
- n_clusters (int, optional, default: 2) – Number of clusters to form
- init (numpy array or scipy sparse matrix, shape (n_features, n_clusters), optional, default: None) – Initial column labels
- max_iter (int, optional, default: 20) – Maximum number of iterations
- n_init (int, optional, default: 1) – Number of times the algorithm is run with different initializations. The final result is the best output of the n_init consecutive runs.
- random_state (integer or numpy.random.RandomState, optional) – The generator used to initialize the centers. If an integer is given, it is used as the seed. Defaults to the global numpy random number generator.
- tol (float, default: 1e-9) – Relative tolerance on the criterion used to declare convergence
- weighting (boolean, default: True) – Whether to apply TF-IDF weighting
Attributes:
- labels_ (array-like, shape (n_rows,)) – Cluster label of each row
- criterion (float) – Criterion obtained from the best run
- criterions (list of floats) – Sequence of criterion values during the best run
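To make the roles of n_clusters, max_iter, n_init, tol and random_state more concrete, here is a minimal pure-Python sketch of a generic spherical k-means loop. It is not coclust's implementation: the function name spherical_kmeans_sketch, the choice of seeding centroids with randomly chosen rows, and the exact form of the relative-tolerance stopping test are assumptions made for illustration. The returned values are meant to mirror the labels_, criterion and criterions attributes described above.

```python
import math
import random


def _normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans_sketch(rows, n_clusters=2, max_iter=20, n_init=1,
                            tol=1e-9, random_state=None):
    """Hypothetical helper: cluster rows by cosine similarity.

    Returns (labels, criterion, criterions) for the best of n_init runs,
    where criterion is the sum of cosine similarities between each row
    and the centroid it is assigned to.
    """
    rng = random.Random(random_state)
    unit_rows = [_normalize(r) for r in rows]
    n_features = len(unit_rows[0])
    best = None

    for _ in range(n_init):
        # Assumption: seed the centroids with distinct, randomly chosen rows.
        seeds = rng.sample(range(len(unit_rows)), n_clusters)
        centroids = [unit_rows[i] for i in seeds]
        criterions = []
        for _ in range(max_iter):
            # Assignment step: each row joins its most similar centroid.
            sims = [[_dot(row, c) for c in centroids] for row in unit_rows]
            labels = [max(range(n_clusters), key=s.__getitem__) for s in sims]
            criterion = sum(s[k] for s, k in zip(sims, labels))

            # Centroid step: sum the members of each cluster, then project
            # the sum back onto the unit sphere.
            sums = [[0.0] * n_features for _ in range(n_clusters)]
            for row, k in zip(unit_rows, labels):
                for j, x in enumerate(row):
                    sums[k][j] += x
            centroids = [_normalize(s) for s in sums]

            # Assumed stopping rule: relative change in criterion below tol.
            converged = bool(criterions) and (
                abs(criterion - criterions[-1]) <= tol * abs(criterions[-1]))
            criterions.append(criterion)
            if converged:
                break

        # Keep the run with the highest final criterion.
        if best is None or criterions[-1] > best[1]:
            best = (labels, criterions[-1], criterions)
    return best


# Two rows dominated by the first feature, two by the last one.
rows = [[5, 1, 0], [4, 2, 0], [0, 1, 5], [0, 2, 4]]
labels, criterion, criterions = spherical_kmeans_sketch(
    rows, n_clusters=2, n_init=5, random_state=0)
```

In this sketch the criterion can only stay equal or increase from one iteration to the next, which is why a small relative change is a reasonable signal to stop, and why keeping the run with the highest final value is a natural reading of "best run".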
Spherical k-means
coclust.clustering.spherical_kmeans provides an implementation of the spherical k-means algorithm.
class coclust.clustering.spherical_kmeans.SphericalKmeans(n_clusters=2, init=None, max_iter=20, n_init=1, tol=1e-09, random_state=None, weighting=True)
Spherical k-means clustering.
Parameters:
- n_clusters (int, optional, default: 2) – Number of clusters to form
- init (numpy array or scipy sparse matrix, shape (n_features, n_clusters), optional, default: None) – Initial column labels
- max_iter (int, optional, default: 20) – Maximum number of iterations
- n_init (int, optional, default: 1) – Number of times the algorithm is run with different initializations. The final result is the best output of the n_init consecutive runs.
- random_state (integer or numpy.random.RandomState, optional) – The generator used to initialize the centers. If an integer is given, it is used as the seed. Defaults to the global numpy random number generator.
- tol (float, default: 1e-9) – Relative tolerance on the criterion used to declare convergence
- weighting (boolean, default: True) – Whether to apply TF-IDF weighting
Attributes:
- labels_ (array-like, shape (n_rows,)) – Cluster label of each row
- criterion (float) – Criterion obtained from the best run
- criterions (list of floats) – Sequence of criterion values during the best run
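The weighting flag refers to TF-IDF weighting, but the exact formula is not stated here. As a rough illustration only, the sketch below applies one common textbook variant to a small count matrix, assuming idf = log(n_rows / document_frequency). The helper name tfidf_weight_sketch is hypothetical, and the variant used by coclust may differ (for example in smoothing or normalization).

```python
import math


def tfidf_weight_sketch(counts):
    """Hypothetical helper: reweight a row-by-feature count matrix.

    Each count is multiplied by log(n_rows / df), where df is the number
    of rows in which the feature occurs (an assumed, unsmoothed variant).
    """
    n_rows = len(counts)
    n_cols = len(counts[0])
    # Document frequency: in how many rows does each feature appear?
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_cols)]
    # Features that never occur get a weight of zero.
    idf = [math.log(n_rows / d) if d > 0 else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(n_cols)] for row in counts]


# Feature 0 occurs in every row; features 1 and 2 occur in one row each.
counts = [[3, 0, 1], [2, 1, 0], [1, 0, 0]]
weighted = tfidf_weight_sketch(counts)
```

Under this variant a feature present in every row receives zero weight, so it no longer influences the cosine similarities that spherical k-means relies on, while rarer features are emphasized.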