MUG System
Fission - Changelog

David Reitter, dreitter at inf.ed.ac.uk

** This file describes overall changes in Fission **
Refer to the CVS logs of source files for more detailed information.

*****************
1.2

- a fix to make it compatible with the latest (>=5.5) versions of SWI-Prolog

*****************
1.1

- fixed small bugs in documentation (./fi instead of ./mug)
- took out deictics from big grammar
- better warning message if template arguments are bad
- assumes default scoring for unknown (new) modes
- simple-grammar now has output_mode/1 specification
- robust against multiple output_mode declarations
- updated one author's e-mail addresses
- implemented benchmarking in benchmark.pl

*****************
1.0 - release

*****************
03 June 2004

- many improvements to the workbench. Use of CSS.
- unit tests available from workbench
- SWI-Prolog >=5.3.8 required
- better unification algorithm. demo2 gives additional results now.
- much more documentation, source code refactored and tidied up
- dialogue manager included (incomplete)
- best-first search (and best-only search) provided. Change the setting in fission.pl.

*****************
10 Feb 2004

Over the past couple of days, Fission saw massive changes to its internal
workings. These contribute to three main achievements:

- Speed and memory optimizations
- Scoring consolidation, improvements
- Error checking

********************************************************************************************
Speed:
********************************************************************************************

- the code responsible for storing intermediate results ready to be used by
  the workbench has changed. (SWI specific.) It is now twice as fast, and
  uses half the memory. Running the maximum of 100 examples should be
  manageable again.

- the central unifier was completely rewritten in order to allow for some
  optimizations such as minimum remaining values optimization.
  The optimization itself could not speed up the process (we're working on
  it...); however, the rewrite brought about a 50 percent speed improvement
  for the MUG application code (try fist/1 instead of fisd/1). This is
  evident in YAP as well, and across several big examples.
  Search techniques such as A* search (or at least Branch&Bound) are very
  hard to realize in Prolog. Given their known memory requirements, combined
  with Prolog's bad performance when copying large terms, it is unclear
  whether A* search would really improve things.

- The scoring improvement (see below) speeds up the process further.

Data:

tested with
  SWI, development (debugging environment), fisd command
and
  YAP, production environment, fist command

tested on Feb 7, 04 CVS main trunk release, Powerbook:
  SWI fisd, demo1, 64 variants, 12.34 CPU time, effectively 37.46 seconds (memory swapping)
  SWI fisd, ex2, 9 variants, 1.64 CPU in 2.09 seconds
  YAP fist, demo1, 2050ms
  YAP fist, ex2, 440ms

performance with new revision, Powerbook:
  SWI fisd, demo1, 64 variants, 2.94 CPU time, effectively 4.98 seconds
  SWI fisd, ex2, 9 variants, 1.03 CPU in 1.58 seconds
  YAP fist, demo1, 580ms
  YAP fist, ex2, 290ms

tested on Feb 8, 04 CVS main trunk release, Franka:
  YAP fist, demo1, 719ms
  YAP fist, ex2, 164ms

performance with new revision (with MRV opt.), Franka:
  YAP fist, demo1, 203ms
  YAP fist, ex2, 104ms

********************************************************************************************
SCORING:
********************************************************************************************

- The utility score is now computed in two steps: before applying the MUG,
  an arithmetic expression is calculated (score:utility:X in the FDs).
  This way, we can be sure that every semantic entity is only scored once.
- The scoring model functions for redundant information were improved
- The scoring model now has device coefficients phi (see HLT paper)
- Reading cost on screen is lower than before

********************************************************************************************
ERROR CHECKING
********************************************************************************************

- less screen output when we start ./fi
- syntax errors in the code are marked red
- MUG and KB warnings are yellow now
- the KB serves as a type hierarchy for the FDs.
  FDs are not fully typed, but when equipped with a type attribute, they
  will be checked against the types defined in the KB. Attributes should
  be declared and typed in the KB.
  Undeclared attributes in typed FDs, values with non-matching types or
  FDs with non-existing types will lead to a complaint of the structure
  checking algorithm (warning only).
  FDs in the input and the MUG are checked like this.
  This feature is experimental. Warnings should never break the process.
  They can be resolved over time.

********************************************************************************************
UNIT TESTS
********************************************************************************************

- some unit tests produce output

- most unit tests produce "malformed request". I investigated this and
  found that the "scope" attribute always (983 cases) is a Prolog term like
    user_intention/email/bcc
  instead of a list containing the scope elements, such as
    [user_intention/email/bcc].
  In 19 cases with several elements, it is given as a string, i.e.
    "task/email/to,task/email/cc,task/email/bcc"
  Giving a Prolog string doesn't really make sense.
  In 5 cases, the syntax was
    scope: "task/email/to,task/email/cc,task/email/bcc,task/email/priority,task/email/body]",
  These bugs seem to result from a problem with the excel2mug script and
  the Excel spreadsheets. My last revision of this left some irregularities.
  The output of Fission was "malformed request" (as no MUG generation path
  could be found), which was correct for these cases.

- assuming that we wanted to keep lists in the scope attribute instead of
  terms (I think we talked about this), I changed the excel2mug script
  accordingly and regenerated the auto test cases.

- the sockets interface now outputs the request / session ID with its
  answer, which comes in handy when you want to trace a problem.
  Well, the IDs in the unit tests were 23784 all the time... so I changed
  that as well in the excel2mug program.

- Some unit tests still don't work, and for the ones I have taken a look
  at, this is not because of technical problems with the engine, but rather
  missing components in the MUG or the like:
    changelist (see example confirm1) is not an action type in the MUG
    inform (see ex inform1) is not yet a task (dialogue act) type in the MUG

To do:

- scoring, complementarity / redundancy bias, model explicitly
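
Illustration for the scope fix described under UNIT TESTS: a minimal sketch of how a spreadsheet cell could be normalised into a Prolog list literal before it is written into a test case. This is not the actual excel2mug code, which is not shown in this changelog; the function name normalize_scope and the choice of Python are assumptions made only for this example. It covers the three input shapes reported above: a bare term, a quoted comma-separated string, and a quoted string with a stray closing bracket.

```python
def normalize_scope(cell: str) -> str:
    """Return the scope value as Prolog list text, e.g. "[a/b,c/d]".

    Hypothetical helper: the real excel2mug script may work differently.
    """
    # Drop surrounding whitespace and any enclosing double quotes.
    text = cell.strip().strip('"').strip()
    # Drop list brackets, including a stray unmatched one.
    text = text.lstrip("[").rstrip("]")
    # Split into scope elements and discard empty pieces.
    elements = [part.strip() for part in text.split(",") if part.strip()]
    return "[" + ",".join(elements) + "]"


if __name__ == "__main__":
    print(normalize_scope("user_intention/email/bcc"))
    print(normalize_scope('"task/email/to,task/email/cc,task/email/bcc"'))
    print(normalize_scope('"task/email/to,task/email/cc]"'))
```

Emitting the list as text keeps the generator independent of any Prolog runtime; the generated test cases can then be read by Fission as ordinary list terms.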