Abstract

While quite a few linguistic corpora with syntactic annotations are available today, resources with discourse-level annotation remain scarce. A flexible, extensible annotation format can speed up the development of such resources. We therefore propose an XML format for annotating rhetorical structure trees. In both human and automatic analysis, rhetorical structure is often difficult to determine and is assigned incrementally; the format therefore allows for underspecification. The paper discusses the design decisions involved, illustrates the format with an example, and sketches some applications.
