Technology to Bridge the Language Gap
Products
SpeechTrans
The challenge of understanding spoken language

AppTek has integrated Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) with Machine Translation (MT). The resulting product, SpeechTrans, performs real-time, dynamic speech-to-speech machine translation. It is deployed on computers, wearable machines, and telephony servers.

The challenge of understanding spoken language was the main incentive behind the development of SpeechTrans, in which AppTek has integrated its own ASR and TTS with MT.

SpeechTrans is designed for:
1. Telephone-to-Telephone Machine Translation
2. PC-based Dictation Systems
3. Handheld Devices

SpeechTrans recognizes spoken utterances (for example, in Arabic dialects) and translates them into English text, which is then synthesized and output. The input is recorded through a telephone channel or a microphone and recognized by one of several ASR systems, each capable of recognizing either the source language or English and tuned to either microphone- or telephone-quality speech. The recognized utterances are normalized using statistical MT based on finite state automata. The normalized output is then translated by a hybrid MT that combines statistical and rule-based features. This hybrid Interlingua approach provides better results for speech input than a direct statistical MT.

The Hybrid Approach

Compared with written language, speech (especially spontaneous speech) poses additional difficulties for automatic MT. Typically, these difficulties are caused by errors in the recognition process, which is carried out before translation. As a result, the sentence to be translated is not necessarily syntactically well-formed. Even without recognition errors, speech translation has to cope with a lack of conventional syntactic structures, because the structures of spontaneous speech differ from those of written language.

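The processing chain described in the SpeechTrans overview above (channel-specific recognition, normalization, hybrid translation, synthesis) can be sketched roughly as follows. This is only an illustrative outline, not AppTek's implementation: the `SpeechPipeline` class, its field names, and the toy stand-in stages in `demo_pipeline` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical type: an ASR system maps recorded audio to recognized text.
Recognizer = Callable[[bytes], str]


@dataclass
class SpeechPipeline:
    """Illustrative speech-to-speech chain; every stage is a placeholder."""

    # One recognizer per (language, channel) pair, e.g. ("arabic", "telephone").
    recognizers: Dict[Tuple[str, str], Recognizer]
    normalize: Callable[[str], str]    # stands in for the normalization step
    translate: Callable[[str], str]    # stands in for the hybrid MT
    synthesize: Callable[[str], bytes] # stands in for TTS

    def run(self, audio: bytes, language: str, channel: str) -> bytes:
        # Pick the ASR system tuned to this language and channel quality.
        recognize = self.recognizers[(language, channel)]
        recognized = recognize(audio)
        normalized = self.normalize(recognized)
        translated = self.translate(normalized)
        return self.synthesize(translated)


def demo_pipeline() -> SpeechPipeline:
    """Build a pipeline from toy stages so the control flow can be exercised."""
    return SpeechPipeline(
        # Toy ASR: pretend the audio bytes already contain the transcript.
        recognizers={("arabic", "telephone"): lambda audio: audio.decode("utf-8")},
        # Toy normalizer: lowercase and collapse whitespace.
        normalize=lambda text: " ".join(text.lower().split()),
        # Toy translator: wrap the text in a marker instead of translating.
        translate=lambda text: f"<en>{text}</en>",
        # Toy TTS: return the text as bytes instead of audio.
        synthesize=lambda text: text.encode("utf-8"),
    )
```

Keeping each stage behind a plain callable means a microphone-tuned recognizer or a different translator could be swapped in without touching `run`.
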
A prime motivation for a hybrid MT system is to take advantage of the strengths of both rule-based and statistical approaches while mitigating their weaknesses. For example, a rule that covers a rare word combination or construction should take precedence over statistics derived from sparse (and thus not very reliable) data. Additionally, rules covering long-distance dependencies and embedded structures should be weighted favorably, since these constructions are more difficult to process in statistical MT. Conversely, the statistical approach should take precedence where large numbers of relevant dependencies are available, novel input is encountered, or high-frequency word combinations occur.

An aspect that is especially important for the distillation engine is the weakness that statistical MT sometimes shows in informativeness (the accurate translation of information), caused by the influence of the target-language model. For example, single words that contribute disproportionately to informativeness, such as terms indicating negation or important content words, may be missing from the output.

Statistical MT Module

Our statistical MT is a finite state transducer using alignment templates. Compared to traditional statistical MT systems, these methods have the advantage of learning translations of phrases, not just individual words, which lets the MT encompass the functionality of example-based approaches and translation memories. A further advantage is that many knowledge sources can be combined by framing them as feature functions within a Maximum Entropy framework.

Rule-based MT Module

Our rule-based module employs a Lexical Functional Grammar (LFG) system. The LFG system has a richly annotated lexicon containing functional and semantic information. It also produces richly annotated intermediate outputs that may interact with the statistical MT module:
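To make the feature-function idea from the Statistical MT Module section more concrete, here is a minimal sketch of how a Maximum Entropy (log-linear) model might rank candidate translations: each knowledge source is a function of the source sentence and a candidate, and the candidate with the highest weighted sum wins. The feature functions, the weights, and the three-word German-to-English toy lexicon are invented for illustration and are not taken from SpeechTrans.

```python
import math
from typing import Callable, Dict, List

# A feature function scores a (source, candidate) pair.
FeatureFn = Callable[[str, str], float]

# Toy data, assumed for the example only.
TOY_LEXICON = {"ich": "i", "gehe": "go", "nicht": "not"}


def lexicon_coverage(source: str, candidate: str) -> float:
    """Fraction of source words whose toy-lexicon translation is in the candidate."""
    src_words = source.split()
    cand_words = set(candidate.split())
    hits = sum(1 for w in src_words if TOY_LEXICON.get(w) in cand_words)
    return hits / len(src_words)


def length_match(source: str, candidate: str) -> float:
    """Negative absolute difference in word count (0.0 is the best value)."""
    return -float(abs(len(source.split()) - len(candidate.split())))


def loglinear_score(source: str, candidate: str,
                    features: Dict[str, FeatureFn],
                    weights: Dict[str, float]) -> float:
    """Weighted sum of feature values: sum over m of lambda_m * h_m(e, f)."""
    return sum(weights[name] * fn(source, candidate)
               for name, fn in features.items())


def posterior(source: str, candidates: List[str],
              features: Dict[str, FeatureFn],
              weights: Dict[str, float]) -> Dict[str, float]:
    """Normalize exponentiated scores into a distribution over the candidates."""
    scores = [loglinear_score(source, c, features, weights) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {c: e / total for c, e in zip(candidates, exps)}


def best_translation(source: str, candidates: List[str],
                     features: Dict[str, FeatureFn],
                     weights: Dict[str, float]) -> str:
    """Decision rule: pick the candidate with the highest log-linear score."""
    return max(candidates,
               key=lambda c: loglinear_score(source, c, features, weights))


SOURCE = "ich gehe nicht"
CANDIDATES = ["i go", "i do not go"]
FEATURES = {"coverage": lexicon_coverage, "length": length_match}
WEIGHTS = {"coverage": 2.0, "length": 0.5}
```

With these toy settings, `best_translation(SOURCE, CANDIDATES, FEATURES, WEIGHTS)` prefers the candidate that keeps the negation, because the coverage feature rewards it. In a real system the weights would be trained rather than set by hand, and a rule-based component could contribute further feature functions in the same way.
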



