Two-Level Clustering
Primary clusters correspond to topic definitions
- Fixed at initialization of run
- Full (static) text vector (complete topic vocabulary)
- Positive / negative example vocabulary vectors
- Two options for primary adaptation:
  - “Pure” Rocchio scheme, modified for the incremental nature of adaptive filtering
  - Differential Rocchio scheme, in which the example vocabularies are distinct
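
As a rough illustration of the “pure” Rocchio option, the sketch below keeps running sums of the positive and negative example vectors so the profile can be recomputed after each new judged document. It is a minimal sketch, not the system's actual implementation: the class name PrimaryCluster, the sparse dict representation, and the weights alpha, beta and gamma are all assumptions made for the example. The differential variant is not shown.

```python
from collections import Counter


class PrimaryCluster:
    """Hypothetical primary cluster: a static topic vector plus running example sums."""

    def __init__(self, topic_vector, alpha=1.0, beta=0.75, gamma=0.25):
        # Static text vector, fixed at initialization (assumed sparse: term -> weight).
        self.topic = dict(topic_vector)
        # Running sums of positive / negative example vectors (assumed representation).
        self.pos_sum = Counter()
        self.neg_sum = Counter()
        self.n_pos = 0
        self.n_neg = 0
        # Rocchio weights; these default values are illustrative assumptions.
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def add_example(self, doc_vector, relevant):
        """Fold one judged document into the running sums (incremental update)."""
        if relevant:
            self.pos_sum.update(doc_vector)
            self.n_pos += 1
        else:
            self.neg_sum.update(doc_vector)
            self.n_neg += 1

    def profile(self):
        """Standard Rocchio combination: topic + positive centroid - negative centroid."""
        terms = set(self.topic) | set(self.pos_sum) | set(self.neg_sum)
        result = {}
        for term in terms:
            weight = self.alpha * self.topic.get(term, 0.0)
            if self.n_pos:
                weight += self.beta * self.pos_sum[term] / self.n_pos
            if self.n_neg:
                weight -= self.gamma * self.neg_sum[term] / self.n_neg
            if weight > 0:  # drop terms whose weight is not positive
                result[term] = weight
        return result
```

Because only the sums and counts are stored, each new judgment costs time proportional to the size of the incoming document vector, which suits a setting where documents arrive one at a time.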
Primaries act as ‘gatekeepers’
- The primary membership threshold allows coarse tuning of recall
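
The gatekeeper role can be sketched as a simple similarity test against the primary's profile. This is an assumption-laden illustration: the use of cosine similarity, the sparse dict vectors, and the function names cosine and passes_gate are hypothetical and are not taken from the original system.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors (term -> weight dicts)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def passes_gate(doc_vector, primary_profile, threshold):
    """Return True if the document is similar enough to the primary to be passed on.

    The similarity measure and the threshold semantics are assumptions for this sketch.
    """
    return cosine(doc_vector, primary_profile) >= threshold
```

Lowering the threshold lets more documents through the gate, which tends to raise recall at the cost of admitting more non-relevant documents; raising it has the opposite effect.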