Text Summarization Branches Out

July 25-26, 2004

Workshop at ACL 2004 (42nd Annual Meeting of the Association for Computational Linguistics)

Forum Convention Centre, Barcelona, Spain

http://www.law.kuleuven.be/icri/conferences/acl_summarization2004.php

 

Text summarization is still largely in a research phase and has so far focused on news text, but it is increasingly becoming a tool for information search and selection in a variety of media. For example, summarization is a necessity when content is shown on the screen of a mobile device. Texts integrated in multimedia documents differ in genre and type, but they all require the same flexibility in how summaries are presented: parameterized compression rates and integration in a mixed-media format.

Text summarization has so far been dominated by statistical techniques. However, for improved output quality and increased compression, other techniques are expected to play important roles as well. Linguistically motivated natural language processing techniques, including semantic analysis and discourse analysis, are almost certainly required for summarization in non-news genres. Automated reasoning techniques could allow fusion and understanding of content. Machine learning, supervised or unsupervised, still has a major role to play. Finally, evaluation is an ongoing concern. The workshop aims to address all these issues.

Program of the Workshop

July 25, 2004

8:25 - 8:30              Welcome 

Session 1               Text Summarization Branches Out
(chair: Donna Harman)

8:30 - 9:30             Invited Lecture
Inderjeet Mani, Department of Linguistics, Georgetown University, USA
slides
slides (part2)

9:30 - 10:00            Extending Document Summarization to Information Graphics
                               Sandra Carberry, Kathleen McCoy and Daniel Chester
                               Department of Computer Science, University of Delaware, USA
                               Stephanie Elzer
                               Department of Computer Science, Millersville University, USA
                               Nancy Green
                               Department of Mathematical Science, University of North Carolina, USA
                              
10:00 - 10:30 Coffee break

Session 2               Evaluation: What Can We Learn from Humans?
(chair: Simone Teufel)

10:30 - 11:00          The Effects of Human Variation in DUC Summarization Evaluation
                               Donna Harman and Paul Over
                               Information Access Division, National Institute of Standards and Technology, USA
slides

11:00 - 11:30          Paragraph-, Word- and Coherence-Based Approaches to Sentence Ranking: A Comparison of Algorithm and Human Performance
                               Florian Wolf and Edward Gibson
                               Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, USA

11:30 - 12:00          Vocabulary Usage in Newswire Summaries
                               Terry Copeck and Stan Szpakowicz
                               School of IT and Engineering, University of Ottawa, Canada
slides

 

12:00 - 13:50 Lunch (an open meeting to discuss plans for DUC 2005 and beyond)

Panel 1                  Text Summarization: A Look at the Last Decades
(chair: Eduard Hovy)



13:50 - 15:20
Donna Harman
Information Access Division, National Institute of Standards and Technology, USA
slides
Marie-Francine Moens
Interdisciplinary Centre for Law and Information Technology,
Katholieke Universiteit Leuven, Belgium
slides
Judith Schlesinger
IDA / Center for Computing Sciences, USA
slides
Hans van Halteren
                               Department of Language and Speech, University of Nijmegen,
The Netherlands

15:20 - 15:40 Coffee break

Session 3               Exploring Novel Horizons
(chair: Marie-Francine Moens)

15:40 - 16:10          Legal Texts Summarization by Exploration of the Thematic Structure and Argumentative Roles
                               Atefeh Farzindar and Guy Lapalme
                               Département d'Informatique et Recherche Opérationnelle, Université de Montréal, Canada
slides
16:10 - 16:40          A Rhetorical Status Classifier for Legal Text Summarisation
                               Ben Hachey and Claire Grover
                               School of Informatics, University of Edinburgh, UK
slides
16:40 - 17:10          Task-Focused Summarization of E-mail
                               Simon Corston-Oliver, Eric Ringger, Michael Gamon and Richard Campbell
Microsoft Research, USA


Session 4               Branching-Out Applications
(chair: Daniel Marcu)

17:10 - 17:30          Hybrid Text Summarization: Combining External Relevance Measures with Structural Analysis
                               Gian Lorenzo Thione, Martin van den Berg, Livia Polanyi and Chris Culy
                               FX Palo Alto Laboratory, USA
slides
17:30 - 17:50          Template-Filtered Headline Summarization
                               Liang Zhou and Eduard Hovy
                               Information Sciences Institute, University of Southern California, USA
slides
17:50 - 18:10          Handling Figures in Document Summarization
                               Robert P. Futrelle
                               College of Computer & Information Science, Northeastern University, USA

July 26, 2004


Session 5               Evaluation: The Metrics
(chair: Dragomir Radev)

8:30 - 9:00              Automatic Evaluation of Summaries Using Document Graphs
                               Eugene Santos Jr., Ahmed A. Mohamed and Qunhua Zhao
                               Computer Science and Engineering Department, University of Connecticut, USA
9:00 - 9:30              ROUGE: A Package for Automatic Evaluation of Summaries
                               Chin-Yew Lin
                               Information Sciences Institute, University of Southern California, USA
slides
9:30 - 10:00            Evaluation Measures Considering Sentence Concatenation for Automatic Summarization by Sentence or Word Extraction
                               Chiori Hori, Tsutomu Hirao and Hideki Isozaki
                               NTT Communication Science Laboratories, Japan


10:00 - 10:30 Coffee break

Panel 2 Text Summarization: What Lies Ahead
(chair: Stan Szpakowicz)


10:30 - 12:00
Eduard Hovy
Information Sciences Institute, University of Southern California, USA
slides
Daniel Marcu
Information Sciences Institute, University of Southern California, USA
Dragomir Radev
School of Information and Department of Electrical Engineering and Computer Science,
University of Michigan, USA
Simone Teufel
Computer Laboratory, Cambridge University, UK
slides

12:00 - 13:30 Lunch

Session 6               Sentence Compression and Fusion
(chair: Judith Schlesinger)

13:30 - 14:00          Sentence Compression for Automated Subtitling: A Hybrid Approach
                               Vincent Vandeghinste and Yi Pan
                               Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium
slides

14:00 - 14:30          Generic Sentence Fusion is an Ill-Defined Summarization Task
                               Hal Daumé III and Daniel Marcu
                               Information Sciences Institute, University of Southern California, USA

Session 7              Topic and Event Detection
(chair: Judith Schlesinger)

14:30 - 15:00          Event-Based Extractive Summarization
                               Elena Filatova
                               Department of Computer Science, Columbia University, USA
                               Vasileios Hatzivassiloglou
                               Center for Computational Learning Systems, Columbia University, USA
slides

15:00 - 15:30          Chinese Text Summarization Based on Thematic Area Detection
                               Po Hu and Tingting He
                               Department of Computer Science, Central China Normal University, China
                               Donghong Ji
                               Institute for Infocomm Research, Singapore


15:30 - 15:35 Closing remarks -- The Organizers
                              Eduard Hovy
                              Information Sciences Institute, University of Southern California, USA
                              Marie-Francine Moens (co-chair)
                              Interdisciplinary Centre for Law & Information Technology, Katholieke Universiteit Leuven, Belgium
                              Dragomir Radev
                              School of Information and Department of Electrical Engineering and Computer Science, University of Michigan, USA
                              Stan Szpakowicz (co-chair)
                              School of Information Technology and Engineering, University of Ottawa, Canada

15:35 - 15:50        Post-workshop coffee break

Contact addresses

Marie-Francine Moens
Interdisciplinary Centre for Law & Information Technology
Katholieke Universiteit Leuven
Tiensestraat 41
B-3000 Leuven
Belgium
marie-france.moens@law.kuleuven.be
http://www.law.kuleuven.be/icri/staff/staff.php?id=13

Stan Szpakowicz
School of Information Technology and Engineering
University of Ottawa
800 King Edward Avenue
Ottawa, Ontario
K1N 6N5
Canada
szpak@site.uottawa.ca
http://www.site.uottawa.ca/~szpak