WDA'2001 Proceedings

International Workshop on Web Document Analysis - WDA'2001

Co-Chairs
Apostolos ANTONACOPOULOS
    University of Liverpool, UK
Jianying HU
    Avaya Labs Research, USA

Program Committee
Henry BAIRD
    Xerox PARC, USA
Thomas BREUEL
    Xerox PARC, USA
Horst BUNKE
    Univ. Bern, Switzerland
Andreas DENGEL
    DFKI, Germany
David DOERMANN
    Univ. Maryland, USA
Dov DORI
    Massachusetts Institute of Technology, USA
Robert HARALICK
    City Univ. New York, USA
Rolf INGOLD
    Univ. Fribourg, Switzerland
Peter KING
    Univ. Manitoba, Canada
Koichi KISE
    Osaka Prefecture Univ., Japan
Nicholas KUSHMERICK
    Univ. College Dublin, Ireland
Yann LeCUN
    AT&T Labs, USA
David LEWIS
    Consultant, USA
Dan LOPRESTI
    Lucent Technologies, USA
Raymond MOONEY
    Univ. Texas, USA
Ethan MUNSON
    Univ. Wisconsin, USA
Cecile ROISIN
    INRIA Rhône-Alpes, France
Larry SPITZ
    Document Recognition Technologies, USA
Ah-Hwee TAN
    Kent Ridge Digital Labs, Singapore
Chew-Lim TAN
    National Univ. Singapore, Singapore
Christine VANOIRBEEK
    EPFL, Switzerland
Marcel WORRING
    Univ. Amsterdam, The Netherlands

Participants list

Proceedings of the

First International Workshop on
Web Document Analysis

(WDA2001)

Seattle, Washington, USA
September 8, 2001
(in association with ICDAR'01)

Message from the Co-Chairs

Panel Session: Web Document Analysis: Challenges and Opportunities

Tom Breuel, Xerox PARC, USA
Web document analysis: Applications and directions

Michael K. Brown, Metrocommute, USA
Web page analysis for voice browsing and the future of Web documents for multimodal interfaces

Andreas Dengel, University of Kaiserslautern and DFKI, Germany
Position statement: Web document analysis employing knowledge acquired using mental models and user profiling

David Doermann, University of Maryland, USA
Web document analysis: Exploiting the Web

Rolf Ingold, University of Fribourg, Switzerland
Web document engineering and document image analysis

Yann LeCun, AT&T Labs, USA

Dan Lopresti, Lucent Technologies Bell Labs, USA
Web document analysis and information retrieval

Michael Rys, Microsoft Corp., USA

Presentation I: Content Extraction and Web Mining

An XML-based approach for the presentation and exploitation of extracted information
          M. Kunze and D. Roesner

Content extraction from HTML documents
          A. Rahman and H. Alam

Quality approach of Web documents by an evaluation of structure relevance
          A. Gagneux, V. Eglin and H. Emptoz

Web structure analysis for information mining
          V. Lakshmi, A.-H. Tan and C.-L. Tan

Query reformulation with collaborative concept-based expansion
          S. Klink

Library document analysis experiences for comprehensive search of the Web
          G. Sebestyen

Presentation II: Web as 2D Documents: Tables and Images

Layout and language: challenges for table understanding on the Web
          M. Hurst

A method to integrate tables of the World Wide Web
          M. Yoshida, K. Torisawa and J. Tsujii

Text extraction from web images based on human perception and fuzzy inference
          A. Antonacopoulos and D. Karatzas

To search for images on the Web, look at the text, then look at the images
          E.V. Munson and Y. Tsymbalenko

What fraction of images on the Web contain text?
          T. Kanungo, C.H. Lee and R. Bradford

Presentation III: Web Document Modelling and Multi-Modal Access

Structured media for multimedia document authoring
          T.T. Thuong and C. Roisin

Applications of graph probing to Web document analysis
          D. Lopresti and G. Wilfong

Documenting system specifications through OPM/Web-based graphics/text equivalence: the case of the free flight system
          D. Dori

Web page analysis for voice browsing
          M. K. Brown, S. C. Glinski and B. C. Schmult

Towards a science of document intent
          S. Harrington, F. Naveda and R. P. Jones

Group Discussions

Web Content Extraction and Mining
          S. Klink and M. Hurst

Web Document Image Analysis
          Y. Wang and A.L. Spitz

Document Modeling and Multi-Modal Access
          O. Hitz and M.K. Brown