Problems with “traditional” IR Systems on the Web
According to Lawrence & Giles (Nature ’99), search engines tend to have:
- Decreasing recall - decreasing coverage due to the rapid growth and dynamic structure of the Web
- Recall = |Retrieved ∩ Relevant| / |Relevant|
- Low precision - huge lists of hits with inaccurate ranking
- Precision = |Retrieved ∩ Relevant| / |Retrieved|
- Stale links and duplicated pages - because indexes are static snapshots, it takes a long time to refresh them
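
The recall and precision definitions above can be illustrated with a minimal Python sketch. The function name `precision_recall` and the document IDs are hypothetical, chosen only for illustration. Returning 0.0 for an empty set is an assumption of this sketch, not something the source specifies.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    # Assumption: an empty denominator yields 0.0 rather than raising an error.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the engine returns 4 pages, 2 of which are relevant,
# while 5 relevant pages exist on the Web in total.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5", "d6", "d7"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

In this toy case, low recall reflects relevant pages missing from the index (d5, d6, d7), and low precision reflects irrelevant hits in the result list (d1, d3).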