Eagles Handbook on Spoken Language Systems

Publisher: Mouton de Gruyter, 1997, 1675 pp.
The technical production of this handbook has been a joint effort by several groups involving a large number of people, and the success of the coordination process is in itself by no means the least significant result of the EAGLES Spoken Language Working Group (see the chapter 'User's Guide'). Particular credit is due to the fellow members of the editorial team, Roger Moore and Richard Winski, for an inspiring and supportive style of collaboration.
The first group comprises the technical authors. All of them deserve thanks for their patience and cooperativeness, despite over-full schedules, heavy responsibilities and, in many cases, also the need to learn LaTeX in the process.
The second group includes the EAGLES support team, particularly the organisers in Pisa, Antonio Zampolli and Nicoletta Calzolari, with their untiring efforts to coordinate a somewhat unruly band of experts. Jock McNaught brought his editorial expertise in electronic publishing to bear on the initial layout design, and on solutions for a multitude of thorny problems.
The third group includes my team in Bielefeld, especially Inge Mertins, who put in a massive amount of work researching sources, re-formatting from a variety of source formats, taking care of complex style packages and spending month after month gently and effectively coaxing various authors to provide readable text, graphics and formulae. Thorsten Bomberg brought his expert knowledge of UNIX systems programming to bear on many technical problems; having found that available software did not scale up to handle a document of the size and complexity of this handbook, he specified and implemented a LaTeX to HTML conversion strategy which did work, and shared his results with the latex2html software developers, resulting in better software. Holger Ulrich Nord and Thorsten Trippel re-formatted the revised version in HTML, battling with many new format styles.
The fourth group is the publishing team led by Anke Beck at Mouton de Gruyter, whose professional standards forced us all to re-think many aspects of presentation and formatting, and on whose advice we were able to rely in designing a LaTeX document class to emulate the Mouton de Gruyter house style (though a couple of our own oddities remain).
The fifth group comprises those responsible at Directorate General XIII of the European Commission, Norbert Brinkhoff-Button, the project officer, and Roberto Cencioni, who deserve acknowledgment for their foresight, their willingness to be persuaded to take risks with this novel publishing venture for the field of spoken language technology, and above all for their patience in what must have seemed like an unending production story.
The main aim during the technical production process was to produce a high-quality handbook which on the one hand documents the core of standard good practice during the 1990s, and on the other hand presents a solid platform for further development. To attain this goal, a number of textual smoothing processes were required. The format conversion and formatting tasks have already been mentioned; in addition, English style and idiom in several chapters, by both native and non-native speakers, had to be considerably adapted for general readability and consistency. Many overlaps were removed, many additional details incorporated, cross-references to other chapters and the other EAGLES Working Groups were included, copyrights (for instance for electronic IPA versions) were negotiated, and additional appendix materials were elicited. Some of the appendices were specially written for the handbook, but most were generously provided by other European Commission funded projects, notably the SAM project, and were left unchanged apart from the necessary re-formatting. In certain areas, for instance corpus copyrights and clandestine recording, legal and ethical issues arose which could only be touched on in passing.
Recommendations are given explicitly in subsections in each chapter, and can thus be conveniently referred to by consulting the table of contents, which is deliberately kept rather detailed and is thus unusually long. The task of completely 'homogenising' the style of recommendations proved to be too comprehensive at the present stage, however, partly because of the variety of recommendation types, and partly because of the different presentation styles of authors from different disciplines.
Since the original conception of the report four years before publication, the importance of the World Wide Web for research has expanded enormously. This has made the publication of sources for corpora and tools unnecessary: Web search engines can quickly find the up-to-date addresses. The second consideration which emerged shortly before the final production phase was the possibility of publication on the Web. The pros and cons of this were much debated, and criteria of overall portability, durability, robustness and convenience of paper versions (with library and paperback editions) scored over a purely electronic hypertext mode; in addition, the publisher is providing CD-ROM and, courageously, Web versions.
Despite all efforts, the handbook has a number of obvious shortcomings, and readers will no doubt collect their own selection of these. For the shortcomings I beg the readers' indulgence, and urge them to communicate their suggestions and thereby help to improve future versions of the handbook.
Part I: Spoken language system and corpus design
System design
SL corpus design
SL corpus collection
SL corpus representation
Part II: Spoken language characterization
Spoken language lexica
Language models
Physical characterisation and description
Part III: Spoken language system assessment
Assessment methodologies and experimental design
Assessment of recognition systems
Assessment of speaker verification systems
Assessment of synthesis systems
Assessment of interactive systems
Part IV: Spoken language reference materials
Character codes and computer readable alphabets
SAMPA computer readable phonetic alphabet
SAM file formats
SAM recording protocols
SAM software tools
EUROPEC recording tool
Digital storage media
Database Management Systems (DBMSs)
Speech standards
EUROM-1 database overview
Polyphone project overview
European speech resources
Transcription and documentation conventions for Speechdat
The Bavarian Archive for Speech Signals

Author(s): Gibbon D., Moore R., Winski R. (eds.)

Language: English
Tags: Computer science and engineering; Artificial intelligence; Computational linguistics