School of Informatics and Department of Computer Science
Department of Management Sciences
Another shortcoming of separating crawling and indexing on the one hand from querying on the other is that the crawl process is not informed by users. Since search engines cannot cover the whole Web, they must choose how to bias their crawling algorithms in favor of certain information resources over others. It would seem preferable to use information gathered from users to guide the crawling algorithms.
These factors point to a need for efficient topic-driven or personalized crawling algorithms. Such crawlers would not run into stale information, and would use knowledge about the topic or the user as context to interpret lexical and linkage cues during their search.
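As a rough illustration of how a topic might serve as context for lexical and linkage cues, the following minimal sketch implements a simple best-first crawler. It is not a method taken from this text: the function names, the cosine-similarity scoring, the rule that a link inherits the score of the page containing it, and the toy in-memory Web are all assumptions made for the example.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_first_crawl(seeds, topic, fetch, max_pages=10):
    """Visit up to max_pages pages, expanding the most promising link first.

    fetch(url) is a caller-supplied (hypothetical) function returning
    (page_text, outgoing_links), or None if the page is unavailable.
    """
    topic_vec = Counter(tokenize(topic))
    # heapq is a min-heap, so scores are negated to pop the best link first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    visited = []
    seen = set(seeds)
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        result = fetch(url)
        if result is None:
            continue
        text, links = result
        # Lexical cue: similarity between the topic and the page text.
        score = cosine_similarity(topic_vec, Counter(tokenize(text)))
        visited.append((url, score))
        # Linkage cue (assumed): a link inherits the score of its parent page.
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return visited


# Toy in-memory "Web" so the sketch runs without network access.
TOY_WEB = {
    "a": ("sports news and football scores", ["b", "c"]),
    "b": ("football league tables and football results", ["d"]),
    "c": ("cooking recipes for pasta", ["e"]),
    "d": ("football transfer rumours", []),
    "e": ("dessert recipes", []),
}

order = [url for url, _ in best_first_crawl(["a"], "football", TOY_WEB.get, max_pages=4)]
```

On this toy graph the crawler follows the link found on the on-topic page b before returning to the off-topic page c, so d is visited ahead of c. A personalized variant could, under the same assumptions, replace the topic string with terms drawn from a user profile.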