Good article to quote for teaching and MT. Also suggests that, besides shortening sentences, editing the source text is of little help. GET O'Brien 2003 for more editing suggestions.

p. 296

Despite the undoubted wisdom of Cronin’s and Engeström and Sannino’s positions, those of us who teach translation technology are faced now with questions about what, and how much, we should include in teaching modules dealing with SMT.

p. 297

In that source (=Kenny and Doherty 2014), we explain the rudiments of Statistical Machine Translation; how SMT systems are created and evaluated and the extent to which they are used in professional practice; where humans (and especially translators) fit into SMT workflows; how translator trainers can overcome conceptual, ethical and technical challenges in designing SMT syllabi; and, most importantly, what we believe about the role of translators in SMT. In a nutshell, we believe that an SMT syllabus should be devised in such a way as to be holistic and empowering for translators.
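Note to self on what the "rudiments" presumably cover (my own gloss, not spelled out in this passage): the textbook formulation of SMT picks the target sentence e that maximises P(e | f) for a source sentence f, decomposed noisy-channel-style into a translation model and a language model, roughly

\hat{e} = \arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\, P(e)

with the log-linear combination of weighted feature functions as the usual generalisation in practice.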

p. 301

The first challenge faced by students in the completion of their assignments was the acquisition of enough high-quality aligned bilingual data to train an SMT system. This was done under the guidance of the lab instructor. In most cases, students used the DGT-TM, the multilingual translation memory made available by the European Commission’s Directorate General for Translation through the Commission’s Joint Research Centre (for details see Steinberger et al. 2012).
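Note to self: the DGT-TM is distributed as zipped TMX volumes, so getting the aligned data into the line-aligned plain-text form SMT toolkits expect is essentially a TMX-parsing job. A minimal Python sketch (the file name, language pair and language codes below are my assumptions, not from the article; actual codes vary by DGT-TM release, so check the TMX headers):

import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def extract_pairs(tmx_path, src_lang="EN-GB", tgt_lang="FR-FR"):
    """Yield (source, target) segment pairs for one language pair from a TMX file."""
    tree = ET.parse(tmx_path)
    for tu in tree.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # TMX stores the language either as xml:lang or as a plain lang attribute
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").upper()
            seg = tuv.find("seg")
            if lang and seg is not None and seg.text:
                segs[lang] = seg.text.strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]

# Write one plain-text file per language, line-aligned, as SMT toolkits expect.
with open("train.en", "w", encoding="utf-8") as f_en, \
     open("train.fr", "w", encoding="utf-8") as f_fr:
    for src, tgt in extract_pairs("dgt_volume_1.tmx"):
        f_en.write(src + "\n")
        f_fr.write(tgt + "\n")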

p. 302

our students had been instructed in best practice in human evaluation of MT output, and they had available to them a pool of fellow students who could have acted as evaluators for outputs from their engines. In practice, students tended to rely on a single scorer (themselves) to judge fluency and adequacy of all outputs, thus compromising the reliability and validity of their human evaluations. We return to this point below. In summary, 21 of the students (or 55%) opted for adequacy/fluency scoring to evaluate the MT output. Twenty-nine students (76%) sought to provide a typology of errors in translations produced by their engine, with a view to ascertaining whether there were any specific interventions they could make to eliminate some of these errors. The typology most commonly adopted by such students was that proposed by Vilar et al. (2006). Fourteen students used both adequacy/fluency scores and an error typology to evaluate outputs.
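The reliability point is worth making concrete for teaching: with a pool of scorers, per-segment adequacy/fluency scores can be averaged and their spread inspected, which a single self-scorer cannot offer. A toy Python sketch (the 1-5 scales and the scores themselves are invented purely for illustration):

from statistics import mean, pstdev

# scores[scorer][segment_id] = (adequacy, fluency), both on a 1-5 scale (assumed).
scores = {
    "scorer_a": {1: (4, 3), 2: (2, 2), 3: (5, 4)},
    "scorer_b": {1: (4, 4), 2: (3, 2), 3: (4, 4)},
    "scorer_c": {1: (3, 3), 2: (2, 1), 3: (5, 5)},
}

for seg in sorted(next(iter(scores.values()))):
    adequacy = [s[seg][0] for s in scores.values()]
    fluency = [s[seg][1] for s in scores.values()]
    # A large spread across scorers is a warning that any single scorer's
    # judgement of this segment would have been unreliable on its own.
    print(f"segment {seg}: adequacy {mean(adequacy):.1f} (sd {pstdev(adequacy):.1f}), "
          f"fluency {mean(fluency):.1f} (sd {pstdev(fluency):.1f})")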

students could choose from a number of possible interventions in an attempt to improve the performance of their SMT engines. The most popular intervention proved to be editing the source text, often using controlled language rules such as those outlined in O’Brien (2003). Students who intervened in this way had usually already conducted an error analysis of the output from their SMT engine, in order

p. 303

to isolate particular problems in the target text. These students typically concluded that most of their edits made little difference to the quality of SMT output when they retranslated their source text. The exception was shortening sentence length, an edit that usually had a positive impact. The second most popular intervention was the uploading of a document-specific glossary, an intervention that had an almost universally positive impact. Other possible, and arguably more promising, interventions – for example, uploading more training data – were, disappointingly, poorly represented in our sample. In total, 27 students implemented 2 main interventions (e.g. editing the source text and uploading a glossary), while 11 students were content with just a single intervention. On average, students carried out 1.6 interventions each.
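Since shortening sentences was the one source edit that reliably helped, a trivial pre-editing check in that spirit is easy to show students (the 25-word threshold and the naive sentence splitting are my own crude choices for illustration, not O'Brien's rules):

import re

MAX_WORDS = 25  # illustrative threshold, not a figure from the article

def flag_long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) for source sentences worth splitting before MT."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())  # naive segmentation
    return [(len(s.split()), s) for s in sentences if len(s.split()) > max_words]

with open("source.txt", encoding="utf-8") as f:
    for count, sentence in flag_long_sentences(f.read()):
        print(f"[{count} words] {sentence}")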