Questions and Answers on the Web

The WWW is a unique resource for information - from grade-school homework to competitive intelligence. However, this resource comes at a price: the high cost of sifting through it to find the right information. This talk presents some of the challenges faced in automatically reaching the relevant information and our use of agents in achieving this goal. Important features include the ability (1) to utilize the services of the many search engines that already exist, (2) to analyze web pages and group them into clusters of meaningfully related pages, (3) to display the clusters and the individual pages in a visually effective format and interface, and (4) to extract answers from these clusters of documents. Additional dimensions include the ability to work with multimedia such as still images, video, and sound. This talk explores these features and their underlying challenges and provides a demonstration of the current system.

Table of Contents

Questions and Answers on the Web

Motivation

Goals

Some Hard Data from the Web, circa 1994

Some Hard Data from the Web, circa 1994

A Fresh but Aging Web...

Sulla, version 1

Sulla, version 2

A Sherlock Example Specification, Part 1

A Sherlock Example Specification, Part 2

A Sherlock Example Specification, Part 3

Lexical Architecture

Clustering Architecture

Clustering Approach

Direct Manipulation Interface

TREC Issues for Clustering

Two-Level Clustering

Two-Level Clustering

Similarity Measure

Adaptive Filtering Parameters

Learning Effects, part 1

Learning Effects, part 2

Question Answering

QA Examples

Our Approach

Some Results (250 bytes)

Some Results (50 bytes)

Some Results (exact phrase)

Web Agent Futures

Author: Dave Eichmann
Library and Information Science
University of Iowa

David A. Eichmann is an Assistant Professor of Information Science at the University of Iowa. Before joining the School of Library and Information Science (jointly with the Department of Computer Science), he was chair of the Software Engineering program at the University of Houston - Clear Lake and Director of Research and Development of the Repository Based Software Engineering Program (RBSE). Funded by NASA and working closely with Johnson Space Center's Software Technology Branch, RBSE collaborated with Rockwell Space Operations Company on the reengineering of a major portion of the Space Shuttle Flight Analysis and Design System. The Web-based MORE repository system developed as part of RBSE was nominated by JSC for the 1998 NASA Software of the Year Award. Dr. Eichmann is a graduate of the University of Iowa, where he received his Ph.D. in computer science in 1989. His research has been in the areas of Web-based information systems, database systems (with an emphasis on type theory and abstraction), and software reuse/reengineering. The interplay between these areas formed the basis for much of his work in agents and repositories.