We implemented three focused crawlers, each driven by a different relevance measure: 1) Naive Bayes, 2) SVM, and 3) cosine similarity. These three crawlers serve as our baselines. On top of the Naive Bayes baseline we implemented an apprentice, following the idea in Chakrabarti's paper "Accelerated Focused Crawling through Online Relevance Feedback", and we use the apprentice in the same way that paper does. For evaluation we chose the framework described in "A General Evaluation Framework for Topical Crawlers" by Filippo Menczer and Gautam Pant. We consider this a better approach than testing the crawlers with our own classifier.
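As a rough illustration of how a cosine-similarity baseline can steer a crawl, the Python sketch below scores each fetched page against a topic description and visits the outlinks of higher-scoring pages first. It is not our actual implementation: the names `term_vector`, `cosine_similarity`, and `focused_crawl`, as well as the `fetch` callable, are hypothetical, and the sketch assumes raw term frequencies with no stemming or TF-IDF weighting.

```python
import heapq
import math
import re
from collections import Counter


def term_vector(text):
    """Build a lowercase bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def focused_crawl(seeds, topic, fetch, max_pages=10):
    """Best-first crawl in which outlinks inherit their parent's score.

    `fetch(url)` is a hypothetical callable (an assumption of this
    sketch) that returns a (page_text, outlinks) pair.
    """
    topic_vec = term_vector(topic)
    # heapq is a min-heap, so scores are negated to pop the best first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    visited, order = set(), []
    while frontier and len(order) < max_pages:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        text, links = fetch(url)
        score = cosine_similarity(term_vector(text), topic_vec)
        order.append((url, score))
        for link in links:
            if link not in visited:
                heapq.heappush(frontier, (-score, link))
    return order
```

In this sketch the Naive Bayes or SVM baselines would differ only in how `score` is computed; the frontier logic stays the same.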
Here are the links to our: