Contact: David Reitter, University of Edinburgh.
Keywords: Language Technology, Computational Linguistics: FASiL, Customer Relationship Management, Discourse Marker Lexicon, Discourse Analysis, Discourse Representation Theory, Rhetorical Structure Theory, Prolog, Linguistics
Why is it so easy for us to produce complex natural language sentences? It is known that speakers seem to cheat a little when they come up with complex structures. They tend to repeat linguistic structures, such as syntactic rules or lexicon entries, rather than making new, independent decisions. When asked (using a passive voice construction): "Was Nero assassinated?", you will likely reply: "I don't think Nero was killed by anyone", but probably not "I don't think anyone killed Nero".
The phenomenon often called "priming" allows insights into the various structural units involved in natural language processing. Priming seems to be controlled by various factors, such as cognitive load and intended audience. In my thesis, I explore these factors and use them to build a model of linguistic choice-making. I present evidence for the hypothesis that priming is a result of two basic cognitive principles: learning and contextualization.
Structural priming (Bock 1986) is the tendency to repeat syntactic and other linguistic choices, rather than to make such choices from scratch. Priming is an effect of preceding context. I have developed a method to measure two kinds of structural priming effects from corpus data: a short-term effect, which decays within a few seconds, and a long-term adaptation effect.
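As a rough illustration of how a decaying repetition effect could be read off corpus data, the sketch below counts how often a syntactic rule repeats the rule that occurred a given number of positions earlier. This is a simplification and not the method used in the thesis: it measures distance in rule positions rather than seconds, and the function name and the rule labels are hypothetical.

```python
from collections import defaultdict


def repetition_by_distance(rules, max_dist):
    """Fraction of positions whose rule equals the rule d positions earlier.

    `rules` is a sequence of rule labels in corpus order (hypothetical input
    format). Returns a dict mapping each distance d to a repetition rate.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for i, rule in enumerate(rules):
        for d in range(1, max_dist + 1):
            if i - d < 0:
                break  # no earlier position at this distance
            totals[d] += 1
            if rules[i - d] == rule:
                hits[d] += 1
    return {d: hits[d] / totals[d] for d in range(1, max_dist + 1) if totals[d]}


# Toy sequence of phrase-structure rules (illustrative data only).
toy_rules = ["NP->Det N", "NP->Det N", "VP->V NP", "NP->Det N"]
rates = repetition_by_distance(toy_rules, max_dist=3)
```

If short-term priming is present in real data, the repetition rate at small distances should exceed the rate at larger ones; a regression of repetition on distance would make that comparison statistically sound.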
I show that priming levels differ under different circumstances, and that this has far-reaching consequences for theories of language production (and comprehension).
First, short-term priming is stronger in task-oriented dialogue than in spontaneous conversation. Second, stronger long-term priming is correlated with task success: pairs of speakers who are more successful at a task that requires communication adapt more to each other's linguistic choices. Both of these findings are explained by Interactive Alignment Theory (Pickering and Garrod, 2004), which claims that interlocutors build a common situation model based on lower-level linguistic priming. Without the task, situation models and linguistic priming are not crucial to the dialogue.
The next step is to examine the role of syntax: what exactly is primed? Recent syntactic theories propose to encode the majority of language-specific information in the lexicon. For structural priming, this is an attractive model, in particular since lexical and syntactic priming have been shown to interact (e.g., Branigan and Pickering 1998). If priming is able to distinguish lexical (learned) categories from phrasal ones, we can support such models.
Syntactic frameworks make explicit assumptions about the structures used to process sentences. Linguists commonly evaluate grammar theory by its accuracy in predicting whether a sentence is acceptable (or grammatical). The psycholinguistic reality of grammar theories can now be tested using large collections of data (corpora) and methods such as structural priming, which can be assumed to directly apply to processing units, i.e. parts of syntactic structures predicted by a theory.
I evaluate priming with respect to phrase-structure grammar and to a lexicalized formalism, Combinatory Categorial Grammar (CCG, Steedman 2000). I find that priming can be modeled computationally as enhanced lexical access to syntactic categories, which specify subcategorization frames. This holds even if the processor is assumed to proceed incrementally, which is an important property when it comes to the psycholinguistic realism of such a formalism. Secondly, however, I find empirical evidence in corpora that transition-based models (on bigram levels) are insufficient to capture priming effects: priming is sensitive to syntactic structure, and not just to transitions between word categories.
In consequence, I spell out my view of structural priming in language production in a model within the general cognitive framework of ACT-R (e.g., Anderson, 1993). The model generates basic sentences ("The policeman gave a flower to the girl"), given a semantic description. It decides which syntactic structures to combine to come up with these sentences, using CCG as syntactic framework. Priming is modeled as a combination of spreading activation (contextualization), causing short-term priming, and base-level learning of syntactic nodes, causing long-term adaptation. The evaluation shows that priming in the ACT-R model mimics that found in corpora, also with respect to frequency effects (rare material primes more) and decay.
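The two mechanisms named above correspond to standard ACT-R activation equations: base-level learning, B = ln(sum of t_j^(-d)) over the ages t_j of past uses, and spreading activation, which adds weighted association strengths from the current context. The sketch below implements just these two equations. The decay value 0.5 is the commonly used ACT-R default, and the function names and example numbers are assumptions for illustration, not the parameters of the thesis model.

```python
import math


def base_level_activation(ages, decay=0.5):
    """ACT-R base-level learning: ln of the summed, power-law-decayed uses.

    `ages` lists the time elapsed since each past use of a chunk
    (for example, a syntactic category); all ages must be positive.
    """
    return math.log(sum(age ** -decay for age in ages))


def total_activation(ages, sources, decay=0.5):
    """Base-level activation plus spreading activation from context.

    `sources` is a list of (attention_weight, association_strength) pairs,
    one per chunk currently in context.
    """
    spreading = sum(weight * strength for weight, strength in sources)
    return base_level_activation(ages, decay) + spreading


# A recently used structure is more active than one used long ago.
recent = base_level_activation([1.0])
old = base_level_activation([100.0])
```

Under this reading, frequent or recent use raises the base level slowly (long-term adaptation), while the spreading term depends only on what is currently in context and so vanishes quickly (short-term priming).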
(December 2007) (With Frank Keller and Johanna Moore.)
We present the Embra system, a first-time entry to DUC for 2005 which performed at or above median for the manual assessment of responsiveness and on 4 out of 5 linguistic quality questions. The system takes a novel approach to relevance and redundancy, modeling sentence similarity using a latent semantic space constructed over a very large corpus. We present a simple approach to modeling specificity based on named entities which shows a small improvement over baseline. Finally, we discuss coherence and present a sentence reordering algorithm with a component-level evaluation demonstrating a positive effect. A key task in an extraction system for query-oriented multi-document summarisation, necessary for computing relevance and redundancy, is modelling text semantics. In the Embra system, we use a representation derived from the singular value decomposition of a term co-occurrence matrix. We present methods to show the reliability of performance improvements. We find that Embra performs better with dimensionality reduction. (With Ben Hachey and Gabriel Murray.)
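To illustrate how relevance and redundancy can both be computed from sentence similarity, the sketch below scores sentences by cosine similarity to a query and penalises similarity to sentences already chosen. It assumes the sentence and query vectors have already been projected into a latent semantic space (for example via SVD); the greedy selection rule, the `redundancy_weight` parameter and all names are illustrative stand-ins, not Embra's actual scoring.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_sentences(query_vec, sentence_vecs, k, redundancy_weight=0.7):
    """Greedily pick k sentence indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(sentence_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, sentence_vecs[i])
            # Redundancy: highest similarity to any sentence already selected.
            redundancy = max(
                (cosine(sentence_vecs[i], sentence_vecs[j]) for j in selected),
                default=0.0,
            )
            return relevance - redundancy_weight * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Sentences 0 and 1 are duplicates; the second pick should avoid the duplicate.
picked = select_sentences([1.0, 1.0], [[2.0, 1.0], [2.0, 1.0], [0.0, 1.0]], k=2)
```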
Generated multimodal output, Voice Output: "Send it now?"
UI on the Fly is a technique that allows a computer to automatically generate multimodal user interfaces, in particular for small computers such as cell phones or iPAQs. We enable these devices to engage in natural language conversation, using the touch-screen together with voice output and input at the same time. The output is tailored to the particular usage situation (in a restaurant, in the car, at home) as well as to the device and preferences of the user. The central system can thus remain blissfully agnostic as you switch from using a phone, to a PDA, to a computer, and back.
Technically, we formulate a hybrid natural language generation approach as a constraint optimization problem with hard and soft constraints. Multimodal Functional Unification Grammar is a formalism based on the unification of attribute-value matrices. It enforces cross-modal coherence in the output. A graphical workbench allows us to maintain and debug grammars. The generation system has been evaluated positively with users, who judged it to be more efficient and showed a trend to perform better at a recall task.
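Unification of attribute-value matrices can be sketched with nested dictionaries: two structures unify if their shared attributes are compatible, and the result carries the information of both. The version below is a minimal sketch and not the MUG implementation; it omits structure sharing (reentrancy) and the soft-constraint scoring, and the attribute names in the example are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts with atomic leaves.

    Returns the merged structure, or None if the two structures clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None  # incompatible values under the same attribute
                result[key] = merged
            else:
                result[key] = b_value
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None


# A dialogue act description unified with a grammar component that
# contributes both a screen element and a voice prompt (toy example).
act = {"act": "confirm", "screen": {"type": "button"}}
component = {"screen": {"label": "Send"}, "voice": {"text": "Send it now?"}}
output = unify(act, component)
```

Because screen and voice attributes live in one structure, a component that fails to unify is rejected for all modes at once, which is one simple way to picture cross-modal coherence as a hard constraint.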
Detailed information about MUG Workbench and the system, and Open Source download
(Developed at MIT Media Lab Europe, 2002-2004, with E. M. Panttaja, F. Cummins, and others.)
Virtual Personal Assistant (Mockup). Voice Input: "Show me this one!"
When humans communicate face to face, they employ a variety of channels to convey meaning. Speech is not necessarily the primary mode. It is supplemented or replaced by facial gestures, body posture, and gestures (deictic, iconic). Can we use these channels to communicate with machines? How can various signals from the different channels (that a human computer user may employ) be integrated to form a unique meaning representation, i.e. a computer command? Can a computer automatically and adaptively display content in a coherent and concise fashion? These questions are part of the research work done in FASiL. The focus of this two-year project is to produce a conversational language engine, demonstrated in three languages: English, Swedish and Portuguese. Important objectives for the EU are to ensure that the end system will both meet its functional requirements and be inclusive to all citizens, including hard-of-hearing and visually impaired people.
Example for a rhetorical relation
Most text displays an internal coherence structure, which can be analyzed as a tree structure of relations that hold between short segments of text. We present a machine-learning governed approach to such an analysis in the framework of Rhetorical Structure Theory. Our rhetorical analyzer observes a variety of textual properties, such as cue phrases, part-of-speech information, rhetorical context and lexical chaining. A two-stage parsing algorithm uses local and global optimization to find an analysis. Decisions during parsing are driven by an ensemble of support vector classifiers. This training method allows for a non-linear separation of samples with many relevant features. We define a chain of annotation tools that profits from a new underspecified representation of rhetorical structure. Classifiers are trained on a newly introduced German language corpus, as well as on a large English one. We present evaluation data for the recognition of rhetorical relations.
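The idea of building a relation tree over text segments from local classifier decisions can be sketched as a greedy bottom-up procedure: repeatedly join the adjacent pair of spans that a scorer rates highest. This shows only a local, greedy step; it does not reproduce the two-stage algorithm or the support vector ensemble described above. The scorer here is a hypothetical cue-phrase heuristic standing in for trained classifiers, and the relation labels are illustrative.

```python
def greedy_rst_parse(segments, score_relation):
    """Build a binary relation tree by merging the best adjacent pair each round.

    `score_relation(left, right)` must return a (relation_label, score) pair.
    Tree nodes are tuples of the form (relation_label, left, right).
    """
    if not segments:
        raise ValueError("need at least one segment")
    nodes = list(segments)
    while len(nodes) > 1:
        best_i, best_label, best_score = 0, None, float("-inf")
        for i in range(len(nodes) - 1):
            label, score = score_relation(nodes[i], nodes[i + 1])
            if score > best_score:
                best_i, best_label, best_score = i, label, score
        # Replace the two joined spans with a single relation node.
        nodes[best_i:best_i + 2] = [(best_label, nodes[best_i], nodes[best_i + 1])]
    return nodes[0]


def cue_phrase_scorer(left, right):
    """Hypothetical stand-in for a classifier: looks only at a leading cue phrase."""
    text = right if isinstance(right, str) else ""
    if text.startswith("because"):
        return ("cause", 0.9)
    if text.startswith("but"):
        return ("contrast", 0.8)
    return ("elaboration", 0.1)


tree = greedy_rst_parse(
    ["He stayed home", "because he was ill", "but he kept working"],
    cue_phrase_scorer,
)
```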
Please find the document type definition grammars and several tools to convert (LDC corpus, O'Donnell's RS3) and access URML data here.
More about the Potsdam Commentary Corpus can be found here.
Drawing rhetorical analyses is no fun when you need to change and update diagrams as you refine your work, or, more importantly, if a lot of analyses are to be drawn. Voila, there we go: This package enables us to typeset beautiful diagrams with no hassle. It is oriented towards the style of the diagrams shown in Mann & Thompson's Rhetorical Structure Theory and subsequent works. This package works perfectly with LaTeX and pdfLaTeX and does not require any special PostScript capabilities on the output side.
Version: 1.3 dated 17-Feb-2003
We describe our ongoing work on an application of XML/XSL technology to a dictionary, from whose source representation various views for the human reader as well as for automatic text generation and understanding are derived. Our case study is a dictionary of discourse markers, the words (often, but not always, conjunctions) that signal the presence of a discourse relation between adjacent spans of text.
CyMON-NLU can inform, chat and gather user information using an advanced natural language understanding engine. It combines statistical morphosyntactic disambiguation methods (trigram tagging), a stemming algorithm and a robust parser for a large semantic grammar implemented in an XML formalism. The scalable CyMON-NLU engine is implemented in C++ and provides interfaces to the agent-based CRM platform CyMON. Further features include automatic language detection and dialog tracking using a semantic network interface. A development kit enables language engineers to easily create semantic grammars for the specific domain.
I have been in charge of the development of CyMON-NLU at Agentscape AG, Berlin, and its subsidiary Agentscape Romania SRL.