Similarity Measure
All document-cluster and cluster-cluster similarities are calculated using a variation of the cosine-vector approach:
We build a lexicon from the vocabulary encountered so far and refresh the cluster term weights every 100 documents.
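
As a rough illustration only, the sketch below shows one way a cosine measure over a growing lexicon could be set up. The slide does not say which variation of the cosine approach is used, so the tf-idf style weighting here is an assumption, and the names `Lexicon`, `Cluster`, `observe`, and `cosine` are hypothetical. Only the refresh interval of 100 documents comes from the slide.

```python
import math
from collections import Counter

REFRESH_INTERVAL = 100  # from the slide: cluster weights are refreshed every 100 documents


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


class Lexicon:
    """Tracks document frequencies for every term encountered so far."""

    def __init__(self):
        self.doc_freq = Counter()
        self.num_docs = 0

    def add_document(self, tokens):
        self.num_docs += 1
        self.doc_freq.update(set(tokens))  # count each term once per document

    def weights(self, term_counts):
        # Assumed weighting (not from the slide): tf * log(1 + N / df).
        return {
            t: tf * math.log(1.0 + self.num_docs / self.doc_freq[t])
            for t, tf in term_counts.items()
            if self.doc_freq[t] > 0
        }


class Cluster:
    """Holds raw term counts plus a weighted vector that is refreshed periodically."""

    def __init__(self):
        self.term_counts = Counter()
        self.vector = {}

    def add(self, tokens):
        self.term_counts.update(tokens)

    def refresh(self, lexicon):
        self.vector = lexicon.weights(self.term_counts)


def observe(tokens, lexicon, clusters):
    """Register a document; refresh all cluster weights every REFRESH_INTERVAL documents."""
    lexicon.add_document(tokens)
    if lexicon.num_docs % REFRESH_INTERVAL == 0:
        for cluster in clusters:
            cluster.refresh(lexicon)
```

Under these assumptions, a document-cluster similarity would be `cosine(lexicon.weights(Counter(tokens)), cluster.vector)`, and a cluster-cluster similarity would be `cosine(c1.vector, c2.vector)`. Between refreshes the cluster vectors are slightly stale, which trades a little accuracy for not recomputing every weight on each incoming document.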