Making of a Mistranslation (Yoneoka and Saito 2012)
The following section presents an example of what was meant to be human-enhanced machine translation, with somewhat disastrous results. Interestingly, the process by which the translation was created is easy to reconstruct. Through this exercise, we can see what the automatic translator attempted to do, and provide guidelines for improving the process.
Fig. X
This sign at the Nasu ropeway station, an attraction in Nasu Highland Park, Tochigi Prefecture, Japan, loosely means:
The ropeway gondola may stop operating in case of high winds, lightning, etc. In such a case, announcements will be broadcast. Please stay within hearing range of the broadcasts, and descend the mountain as soon as possible.
It is easy to see how the English translation was created. Compare the following:
(1) Japanese original: 強風・雷・等で運転休止なるおそれがある時に発令します。放送にて案内致しますので、聞こえる範囲内で行動の上、早めに下山ください。
(2) Excite translation: strong wind and thunder - etc. -- a plant shutdown -- when there is fear, it issues. Since I show around by broadcast, please Shimoyama gives a little early after acting within limits which can be heard.
(3) Final English translation: It issues, when a strong wind and thunder, etc. may make an opreation pause. Since I show around by broadcast, please get down from a mountain a little early after acting within limits which can be heard.
The translation process can be reconstructed in two steps:
Step 1. Cyclical automatic translation
First, the Japanese original (1) was fed into Excite, giving (2). This translation has three obvious errors with three different causes:
A. 下山 → Shimoyama
B. 運転休止 → a plant shutdown
C. おそれがある時に → when there is fear
A. is a case of homographs: 下山, meaning “descend the mountain” when read as gezan, was mistranslated as the family name “Shimoyama”, which is written with the same characters. B. is a problem of context: “plant shutdown” would be a correct translation within the context of a factory, but is not the correct term for a moving vehicle. C. is a literal translation of the Japanese expression 恐れがある時; there is no “fear” involved in the possibility of the ropeway stopping.
In order to “correct” these errors, the human translator probably rephrased the original Japanese and did another translation lookup with Excite. Instead of 下山 [gezan], then, the phrase 山から下りる [yama kara oriru] produced “It gets down from a mountain” as in (3). Similarly, if 運転休止 is broken up into 運転 and 休止, Excite yields the individual terms “operation” and “pause” respectively, which is exactly what is found in (3). Finally, おそれがある時に may have been rephrased as the more literal を引き起こす可能性がある時, which was translated as “may make”.
The following is the result from the combined automatic translation and cyclical lookup process.
Modified Excite 1: strong wind and thunder - etc. -- operation pause -- when may make, it issues. Since I show around by broadcast, please It gets down from a mountain a little early after acting within limits which can be heard.
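The splice-in procedure reconstructed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `EXCITE_OUTPUT`, `LOOKUP_PATCHES`, and `apply_patches` are hypothetical, the patch table simply restates the three lookups described in the text, and no call to Excite itself is made.

```python
# Raw Excite output, i.e. sentence (2) in the text.
EXCITE_OUTPUT = (
    "strong wind and thunder - etc. -- a plant shutdown -- when there is fear, "
    "it issues. Since I show around by broadcast, please Shimoyama gives a "
    "little early after acting within limits which can be heard."
)

# (faulty fragment, fragment obtained from a second lookup of rephrased Japanese).
# The pairings follow the reconstruction in the text; the table is illustrative.
LOOKUP_PATCHES = [
    ("a plant shutdown", "operation pause"),               # 運転 and 休止 looked up separately
    ("there is fear", "may make"),                         # を引き起こす可能性がある時
    ("Shimoyama gives", "It gets down from a mountain"),   # 山から下りる
]


def apply_patches(text, patches):
    """Splice each looked-up fragment over the fragment it is meant to fix."""
    for faulty, replacement in patches:
        text = text.replace(faulty, replacement)
    return text


modified_excite_1 = apply_patches(EXCITE_OUTPUT, LOOKUP_PATCHES)
print(modified_excite_1)
```

Because only isolated fragments are swapped, the sentence frame produced by the first translation pass is left untouched, which is why the result still reads as ungrammatical English.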
Step 2. Grammar tweaking
In the first sentence, the word order is obviously incorrect. The human translator, knowing that English sentences follow SVO order, may have searched for a subject and found “it”. The clause beginning with “when” also requires SVO order, so the remaining translated phrases were chained together like puzzle pieces. Finally, the subject in the phrase “It gets down from a mountain” was omitted. These modifications yielded the following:
Modified Excite 2: It issues when strong wind and thunder etc. may make an operation pause. Since I show around by broadcast, please get down from a mountain a little early after acting within limits which can be heard.
This is almost identical to (3), the English translation actually used on the sign; the most visible difference is that the word “operation” was misspelled. This misspelling is clearly not due to any mistranslation, but is a common occurrence when the text changes hands from the translator to the signmaker.
The process outlined above may have lulled the translator into a false sense of security, a belief that he or she had “improved” the computer output. As seen from the results, however, the outcome was not successful at all. Two further steps were needed: pre-editing and post-editing.
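One way to check how close the reconstruction comes to the sign is a word-level comparison. The sketch below uses Python's standard `difflib` module; the helper name `word_changes` and the two constants are hypothetical, and the two strings are copied from Modified Excite 2 and (3).

```python
import difflib

MODIFIED_EXCITE_2 = (
    "It issues when strong wind and thunder etc. may make an operation pause. "
    "Since I show around by broadcast, please get down from a mountain a little "
    "early after acting within limits which can be heard."
)
SIGN_TEXT = (
    "It issues, when a strong wind and thunder, etc. may make an opreation pause. "
    "Since I show around by broadcast, please get down from a mountain a little "
    "early after acting within limits which can be heard."
)


def word_changes(before, after):
    """Return (tag, old words, new words) for every span where the texts differ."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


changes = word_changes(MODIFIED_EXCITE_2, SIGN_TEXT)
for tag, old, new in changes:
    print(f"{tag}: {old!r} -> {new!r}")
```

Besides the misspelled “opreation”, a comparison of this kind also surfaces small differences in punctuation and articles, none of which affect the argument.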
Pre-editing of the original Japanese using our recommended guidelines results in:
(4) Pre-edited Japanese: 強風と雷などがある時、ロープウェイが運転を止める可能性があります。その場合、警報を出します。私たちは放送でお知らせしますので、放送が聞こえる範囲内で動き、早く山を下りてください。
(5) English translation: When there are a strong wind, thunder, etc., a ropeway may stop operation. In that case, we take out an alarm and announce you by broadcast. Please move within limits which hear broadcast and get down from a mountain early.
Although (5) is far from perfect, it is already much clearer than the translation that was actually used on the sign. With some work on articles (a strong wind → strong winds, a ropeway → the ropeway, a mountain → the mountain), the translation becomes generally understandable, at least enough that anyone reading it can make out the meaning and avoid being left behind in dangerous, perhaps life-threatening circumstances.
Three post-editing options could also have been used.
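To make the effect of the article fixes concrete, the following sketch applies them to (5). The names `PRE_EDITED_OUTPUT`, `ARTICLE_FIXES`, and `fix_articles` are hypothetical, and the three replacements are exactly those listed above, nothing more.

```python
# Sentence (5): machine translation of the pre-edited Japanese.
PRE_EDITED_OUTPUT = (
    "When there are a strong wind, thunder, etc., a ropeway may stop operation. "
    "In that case, we take out an alarm and announce you by broadcast. "
    "Please move within limits which hear broadcast and get down from a mountain early."
)

# Article corrections named in the text (illustrative table).
ARTICLE_FIXES = {
    "a strong wind": "strong winds",
    "a ropeway": "the ropeway",
    "a mountain": "the mountain",
}


def fix_articles(text, fixes):
    """Apply each article correction as a plain phrase replacement."""
    for wrong, right in fixes.items():
        text = text.replace(wrong, right)
    return text


post_edited = fix_articles(PRE_EDITED_OUTPUT, ARTICLE_FIXES)
print(post_edited)
```

The remaining problems (“take out an alarm”, “announce you”, “limits which hear broadcast”) are untouched by this step and would still need a human post-editor.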
1. Back-translation
If we back-translate (3) into Japanese, we obtain the following:
(6) Back translation of (3): それは出ます、いつ、強風および雷-など、オペレーション間を取ってもよい。私が放送によってまわりに示すので、聞くことができる範囲内に作用した後に山から少し早く降りてください。(rough translation by author: That comes out, when, strong wind and thunder, etc., can take during the operation. I will show you around by broadcast, so after using in the limits which can hear please get down from the mountain a little early.)
This back-translation highlights the lack of concrete subjects (What will stop operating? Who will broadcast?) in English, indicating that more editing of (2) is necessary.
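A back-translation check of this kind can be wired up as a small routine. The sketch below is a rough illustration only: `back_translation_check` and `canned_en_to_ja` are hypothetical names, the translator is passed in as a plain callable (here a stub that returns the back-translation quoted in (6) rather than calling any real service), and the character-level similarity score with its 0.8 threshold is an assumed, crude signal, not a substitute for a human reading the back-translation.

```python
import difflib

SOURCE_JA = (
    "強風・雷・等で運転休止なるおそれがある時に発令します。"
    "放送にて案内致しますので、聞こえる範囲内で行動の上、早めに下山ください。"
)
SIGN_EN = (
    "It issues, when a strong wind and thunder, etc. may make an opreation pause. "
    "Since I show around by broadcast, please get down from a mountain a little "
    "early after acting within limits which can be heard."
)
BACK_JA = (
    "それは出ます、いつ、強風および雷-など、オペレーション間を取ってもよい。"
    "私が放送によってまわりに示すので、聞くことができる範囲内に作用した後に"
    "山から少し早く降りてください。"
)


def canned_en_to_ja(english):
    """Stub translator: returns the back-translation quoted in (6)."""
    return {SIGN_EN: BACK_JA}[english]


def back_translation_check(source_ja, english, en_to_ja, threshold=0.8):
    """Back-translate `english` and flag it when it drifts far from the source."""
    back_ja = en_to_ja(english)
    score = difflib.SequenceMatcher(a=source_ja, b=back_ja, autojunk=False).ratio()
    return back_ja, score, score < threshold


back_ja, score, needs_review = back_translation_check(SOURCE_JA, SIGN_EN, canned_en_to_ja)
print(f"similarity={score:.2f} needs_review={needs_review}")
print(back_ja)
```

In practice the printed back-translation matters more than the score: it is the text the translator would read to notice that the subjects of “stop operating” and “broadcast” have gone missing.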
2. Online bilingual corpus and dictionary site look-up
Bilingual corpus-based dictionary and collocation sites for English and Japanese may be used to provide hints as to how a word or phrase is actually used. Examples include Eijiro by ALC, a dictionary that provides examples and corpus results, and Weblio, an online dictionary/corpus portal that includes over 600 specialty dictionaries and encyclopedias and 8 million word entries (Tono 2010).
3. Online "native check" through social language learning sites
On online social language learning sites such as Lang-8, native speakers edit posts written by second language speakers. Fig. X below shows an entry written in (broken) Chinese, which was kindly corrected by three different native speakers, who added encouraging comments and pointers such as “Your Chinese sounds like it was written by a Japanese speaker”.
Fig. X
For further reference:
http://ejje.weblio.jp/sentence/
ALC SITE
www.lang-8.com