翻译是艺术,还是数学题?

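2017-11-17 03:06 Gideon Lewis-Kraus, Deng Zhihui
The World of English (英语世界), 2017, No. 5
Keywords: gist (要义), corpus data (语料), corpus (语料库)

By Gideon Lewis-Kraus (吉迪恩·路易斯-克劳斯); translated by Deng Zhihui (邓志辉)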

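Universal translation¹ has long been motivated by a utopian ambition, a dream that harks back to Genesis, of a common tongue that perfectly maps thought to world².

Notes:
1. Science fiction often takes for granted some kind of "universal translator" (usually rendered in Chinese as “通用翻译器” or “万能翻译器”) that lets humans communicate smoothly with aliens. "Universal translation" derives from "universal translator"; the Chinese translation here makes the meaning somewhat more explicit.
2. Human thought reflects objective reality, and language is the tool that conveys thought. If a common human language existed, it could overcome the national character of individual languages, so that all of humanity's understanding of the objective world (that is, thought) would be expressed through it without any divergence.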

[2] Translation is possible, but we are bedeviled³ by conflict. This fallen state of affairs is often attributed to the translators, who must not be doing a properly faithful job. The most succinct⁴ expression of this suspicion⁵ is “traduttore, traditore,” a common Italian saying that’s really an argument masked as a proverb. It means, literally, “translator, traitor,” but even though that is semantically on target, it doesn’t match the syllabic⁶ harmoniousness of the original, and thus proves the impossibility it asserts.

Notes:
3. bedevil: to torment; to mistreat (使痛苦;虐待).
4. succinct: concise; brief (简洁的;简明的).
5. suspicion: doubt; distrust (怀疑).
6. syllabic: of or relating to syllables (音节的;分音节的).

[3] For now, efforts in the discipline of machine translation are mostly concerned with the dutiful assembly of “cargo trucks” to ferry information across linguistic borders. The hope is that machines might efficiently and cheaply perform the labor of rendering sentences whose informational content is paramount⁷: “This metal is hot,” “My mother is in that collapsed house,” “Stay away from that snake.” Beyond its use in Google Translate, machine translation has been most successfully and widely implemented in the propagation⁸ of continent-spanning weather reports or the reproduction in 27 languages of user manuals for appliances. As one researcher told me, “We’re great if you’re Estonian and your toaster is broken.”

Notes:
7. paramount: of supreme importance (至为重要的).
8. propagation: dissemination; spread (传播).

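[4] Warren Weaver⁹, a founder of the discipline, conceded: “No reasonable person thinks that a machine translation can ever achieve elegance and style. Pushkin¹⁰ need not shudder.” The whole enterprise introduces itself in such tones of lab-coat¹¹ modesty.

Notes:
9. Warren Weaver (1894-1978): American mathematician, regarded as the founding father of machine translation. He floated the idea of machine translation as early as 1947, and in 1949 he issued a memorandum titled “Translation” that formally raised the problem of machine translation and discussed it in detail.
10. Pushkin (1799-1837): celebrated Russian poet, standing in here for poets in general, who care about elegance and style in language. The author's point is that machine translation cannot achieve elegance and style, so poets need not worry about being replaced by machines. From a translation standpoint, the literal “普希金们” ("the Pushkins") would be somewhat obscure and “以普希金为代表的诗人们” ("the poets represented by Pushkin") too cumbersome, so the Chinese text simply uses “诗人们” ("the poets").
11. lab coat: a laboratory work coat; by extension here, "plain and practical, without any embellishment" (简朴实用的,不加任何修饰的).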

[5] In 1960, one of the earliest researchers in the field, the philosopher and mathematician Yehoshua Bar-Hillel¹², wrote that no machine translation would ever pass muster¹³ without human “post-editing.” He called attention to sentences like “The pen is in the box” and “The box is in the pen¹⁴.” For a translation machine to be successful in such a situation of semantic ambiguity¹⁵, it would need at hand not only a dictionary but also a “universal encyclopedia.” The brightest future for machine translation, he suggested, would rely on coordinated efforts between plodding¹⁶ machines and well-trained humans. The scientific community largely came to accept this view: Machine translation required the help of trained linguists, who would derive increasingly abstract grammatical rules to distill natural languages down to the sets of formal symbols that machines could manipulate.

Notes:
12. Yehoshua Bar-Hillel (1915-1975): Israeli philosopher, mathematician, and linguist, best known for his work in machine translation and formal linguistics.
13. pass muster: to be acceptable; to meet the required standard (及格;合乎要求).
14. pen: can also mean "enclosure" or "to confine," among other senses (还有“围栏,关押”等义).
15. ambiguity: the property of having more than one possible meaning (歧义;一语多义).
16. plodding: slow and laborious, like an old ox pulling a broken cart; careful but mechanical (老牛拖破车似的;做事慎重而呆板的).

[6] This paradigm¹⁷ prevailed¹⁸ until 1988, year zero for modern machine translation, when a team of IBM’s speech-recognition researchers presented a new approach. What these computer scientists proposed was that Warren Weaver’s insight¹⁹ about cryptography²⁰ was essentially correct but that the computers of the time weren’t nearly powerful enough to do the job. “Our approach,” they wrote, “eschews²¹ the use of an intermediate²² mechanism (language) that would encode the ‘meaning’ of the source text.” All you had to do was load reams²³ of parallel text²⁴ through a machine and compute the statistical likelihood of matches across languages. If you train a computer on enough material, it will come to understand that 99.9 percent of the time, “the butterfly” in an English text corresponds to “le papillon” in a parallel French one. One researcher²⁵ quipped²⁶ that his system performed incrementally better each time he fired a linguist. Human collaborators, preoccupied with shades²⁷ of “meaning,” could henceforth be edited out entirely.

Notes:
17. paradigm: model; exemplar (范例;典范).
18. prevail: to be widespread; to be dominant (普遍存在;盛行).
19. Refers to the view Weaver set out in his 1949 “Translation” memorandum: that translating resembles deciphering a code, so machine translation can be studied from that angle.
20. cryptography: the study of codes and ciphers (密码学).
21. eschew: to avoid; to abstain from (避开;戒绝).
22. intermediate: in between (中间的).
23. ream: (informal) a large amount of writing (大量的文字或写作).
24. parallel text: texts written in different languages that stand in a translation relationship to one another (平行语料).
25. Refers to Frederick Jelinek (1932-2010), a world-renowned expert in speech recognition and natural language processing who proposed a statistical framework for speech recognition while working at IBM's laboratory. The remark exists in several versions, one being “Every time I fire a linguist, the performance of the speech recognizer goes up.”
26. quip: to make a witty remark (讲俏皮话).
27. shade: a slight difference; a nuance (差别;不同).
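
The statistical idea in paragraph [6] can be made concrete with a toy sketch. The Python below is only an illustration under stated assumptions: the four sentence pairs in `PAIRS` are invented, the function name `train_ibm_model1` is hypothetical, and the estimation procedure is a bare-bones version of the expectation-maximization scheme commonly called IBM Model 1 (without a NULL word). It is not a reconstruction of the system the IBM team actually presented.

```python
from collections import defaultdict

# Invented toy parallel corpus (English, French); an assumption for illustration.
PAIRS = [
    ("the butterfly".split(), "le papillon".split()),
    ("the house".split(), "la maison".split()),
    ("a butterfly".split(), "un papillon".split()),
    ("the dog".split(), "le chien".split()),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[e][f], an approximation of P(French word f | English word e)."""
    # Start with equal weight for every word pair that co-occurs in a sentence pair.
    t = defaultdict(lambda: defaultdict(float))
    for en, fr in pairs:
        for e in en:
            for f in fr:
                t[e][f] = 1.0

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: split each French word's "credit" among the English words
        # of the same sentence, in proportion to the current estimates.
        for en, fr in pairs:
            for f in fr:
                norm = sum(t[e][f] for e in en)
                for e in en:
                    frac = t[e][f] / norm
                    count[e][f] += frac
                    total[e] += frac
        # M-step: renormalize the collected credit into probabilities.
        for e in count:
            for f in count[e]:
                t[e][f] = count[e][f] / total[e]
    return t


table = train_ibm_model1(PAIRS)
best = max(table["butterfly"], key=table["butterfly"].get)
print(best, round(table["butterfly"][best], 3))
```

On this tiny corpus “papillon” ends up as the most probable match for “butterfly,” because “le” and “un” are better explained by “the” and “a.” No dictionary or grammar rule is consulted at any point, which is the sense in which the approach does without an encoding of “meaning.”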

[7] This statistical strategy, which supports Google Translate and Skype Translator and any other contemporary system, has undergone nearly three decades of steady refinement²⁸. The problems of semantic ambiguity have been lessened by paying pretty much no attention whatsoever to semantics. The English word “bank,” to use one frequent example, can mean either “financial institution” or “side of a river,” but these are two distinct words in French. When should it be translated as “banque,” when as “rive”? A probabilistic²⁹ model will have the computer examine a few of the other words nearby. If your sentence elsewhere contains the words “money” or “robbery,” the proper translation is probably “banque.” (This doesn’t work in every instance, of course. A machine might still have a hard time with the relatively simple sentence “A Parisian has to have a lot of money to live on the Left Bank.”)

Notes:
28. refinement: (fine-grained) improvement (精细的改进,改善).
29. probabilistic: based on probability (基于概率的;或然的).
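
The “bank” example in paragraph [7] can be sketched in a few lines. What follows is a minimal illustration, not a description of how Google Translate or any real system works: the counts in `CONTEXT_COUNTS` are made up, the function name `translate_bank` is hypothetical, and a naive Bayes score with add-one smoothing and equal priors stands in for whatever probabilistic model a production system would use.

```python
import math
import re

# Hypothetical counts (invented for illustration): how often each context word
# appeared near "bank" when human translators rendered it as each French word.
CONTEXT_COUNTS = {
    "banque": {"money": 40, "robbery": 15, "account": 30, "river": 1, "left": 2},
    "rive": {"money": 2, "robbery": 1, "account": 1, "river": 35, "left": 4},
}
VOCAB = {word for counts in CONTEXT_COUNTS.values() for word in counts}


def translate_bank(sentence):
    """Pick 'banque' or 'rive' for "bank" using only nearby cue words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    scores = {}
    for french, counts in CONTEXT_COUNTS.items():
        # Add-one smoothing so an unseen cue never yields log(0).
        total = sum(counts.values()) + len(VOCAB)
        scores[french] = sum(
            math.log((counts.get(w, 0) + 1) / total) for w in words if w in VOCAB
        )
    return max(scores, key=scores.get)


print(translate_bank("He took the money from the bank after the robbery"))
print(translate_bank("We walked along the river bank"))
print(translate_bank("A Parisian has to have a lot of money to live on the Left Bank."))
```

With these invented counts the first sentence comes out as “banque” and the second as “rive.” The third also comes out as “banque,” because “money” outweighs “left,” which is exactly the kind of failure the parenthetical above describes.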

[8] Many computational linguists continue to claim that, after all, they are interested only in “the gist³⁰” and that their duty is to find inexpensive and fast ways of trucking the gist across languages. But they have effectively arrogated³¹ to themselves the power to draw a bright line where “the gist” ends and “style” begins. Human translators think it’s not so simple. All texts have some purpose in mind, and what a good human translator does is pay attention to how the means serve the end, how the “style” exists in relationship to “the gist.” The oddity is that belief in the existence of an isolated “gist” often obscures the interests at the heart of translation.

Notes:
30. gist: the main point; the general idea (要点;大意).
31. arrogate: to claim without right; to usurp (僭称;霸占).

[9] What mostly annoys human translators isn’t the arrogance of machines but their appropriation of the work of forgotten or anonymous humans. Machine translation necessarily supervenes on previous human effort; otherwise there wouldn’t be the parallel corpora³² that the machines need to do their work. I mentioned to an Israeli graduate student that I had been reading the Wikipedia page of Yehoshua Bar-Hillel and had found out that his granddaughter, Gili, is a minor celebrity in Israel as the translator of the “Harry Potter” books. He hadn’t heard of her and didn’t seem interested in the process by which a publisher paid to import books about magic for children. But we would have no such tools as Google Translate for the Hebrew-English language pair if Bar-Hillel had not hand-translated, with care, more than 4,000 pages of an extremely useful parallel corpus. In a sense, their machines aren’t actually translating; they’re just speeding along tracks set down by others. This is the original sin of machine translation: The field would be nowhere³³ without the human translators they seek, however modestly, to supersede³⁴.

Notes:
32. corpora (plural of corpus): a corpus is a structured collection of raw language data, gathered and processed for a specific application and searchable by computer programs (语料库).
33. be nowhere: to have no chance of winning; to achieve nothing (没有取胜的机会;一无所得).
34. supersede: to replace; to supplant (取代,代替).

无障碍型通用翻译的灵感来源是一个令人联想到《圣经·创世记》的乌托邦式梦想,即借助某种共同语言,架构起人类思维与客观世界间的完美桥梁。

[2]翻译诚然可为,但分歧依然令人苦不堪言。这种不尽人意的现状常被归咎于译者——人们想当然地认为这必然都因他们未能忠实尽责所致。对此类不信任心态最言简意赅的表达,莫过于一句意大利谚语“traduttore,traditore”。这话貌似一句格言,其实只是一个观点,其英文直译是“translator, traitor”(“译者,叛徒”)。不过英译文虽然语义无误,音节上却无法再现意大利原文的对称和谐之美,因此倒恰好佐证了原句所宣称的翻译不可为之论。

[3]就目前来看,机器翻译领域主要是兢兢业业地以“货车”组装方式进行语际间信息传送,以期在翻译那些信息成分至上的句子时,机器可以更廉价而高效,例如:“这块金属很热”“我母亲还在那栋倒塌的房子里”“离那条蛇远点”等。除了谷歌翻译软件以外,机器翻译应用最成功、服务范围最广的领域,当属洲内天气预报的传播系统,或是家用电器使用说明书的27种语言翻译系统。如一位研究者所说,“如果你是爱沙尼亚人,而且面包机坏了,这时你会发现我们的服务相当不错”。

[4]机器翻译的鼻祖沃伦·韦弗曾经坦承:“但凡有点理智的人都清楚,机器翻译永远无法实现语言的优雅美感或风格的艺术再现,因此诗人们不必恐慌。”——整个机译行业都以这种朴实的语气自谦。

[5] 1960年,该领域的先驱之一,同时也是哲学家和数学家的耶霍舒亚·巴尔-希勒尔发文宣称:除非有人工译者进行后期编辑加工,否则机器翻译的质量绝对无法过关。他提醒人们注意一些歧义句,如:“pen(钢笔)在盒子里”和“盒子在pen(笼子)里”,机器在处理此类语义歧义时,仅靠字典尚不足够,还必须借助某种“万能百科全书”才行。因此,在他看来,机器翻译要实现最佳前景,必须依靠呆板的机器与训练有素的人工紧密合作,方有可为。科学界很大程度上逐渐接受了这种观点:机器翻译必须依靠专业语言学家的帮助,后者通过导出日益抽象的语法规则,将自然语言简化归纳为一套套正式符号,供机器识别使用。

[6]这一思维范式持续至1988年。在这个现代机器翻译技术的元年,来自IBM公司的一个语言识别研究团队展示了一种全新方法。这些计算机科学家提出,沃伦·韦弗当年从密码学视角将翻译视为“解码”过程的看法本质上没错,但受当时计算机技术的限制,从该思路出发无法实现机器翻译。他们写道:“我们的方法则避开了这一常规思路,不再依赖中介机制(语言)来对源文本的意义进行编码。”要做的只是通过机器载入大量平行语料,然后对语言间的对应情况进行统计分析即可。只要给计算机的训练语料库够大,它就会逐渐学习到,英语文本中的the butterfly在99.9%的情况下都与法语平行文本中的le papillon相对应。有位研究者曾打趣说,每开除一名语言学家,他的系统运行效率就会大幅上升。纠结盘桓于各种细枝末节“意义”间的人类伙伴似乎从此可以彻底退场了。

[7]这种统计法是如今谷歌翻译器、Skype翻译器及其他各类当代机译系统的技术基础,且自其问世30年以来,一直处于稳定改良中。语义歧义问题已有所减少,而解决方案居然是:彻底绕开语义。举一个大家熟知的例子,英语中bank一词同时有“金融机构”与“河岸”之义,而在法语中这两个意思分别对应两个完全不同的词。究竟何时该将bank译成法语的banque(银行)、何时该译成rive(河岸)呢?机译概率模型会指引计算机查看附近的几个单词,如果句中其他地方出现“钱”或“抢劫”等类词语,则可判断恰当译法很可能是banque。(这当然不适用于所有情形。“巴黎人得有一大笔钱才能住在左岸”,这个句子本身虽然并不复杂,计算机在翻译时恐怕却得颇费周章。)

[8]很多计算语言学家反复声明,称自己感兴趣的仅限于信息“要义”,职责是寻找低廉而快捷的手段实现语际间的信息要义输送。殊不知,他们在此过程中僭取了一个权利,即由他们来界线分明地判定何时“要义出”、何时“风格入”。人类译者则认为事情并非如此简单。所有文本都自带意图,而一名优秀人类译者所做的,恰恰是关注手段如何为意图服务、“风格”如何与“要义”互存。悖论就在于:相信某种“要义”能独立自存,这一看法反而遮蔽了翻译本身的核心要义。

[9]最让人类译者生恼的还不是机器的这种倨傲,而是它们对无名或匿名人士劳动成果的任意取用。机器翻译无可避免要仰赖此前的人类劳动成果,这也是为什么要建立平行语料库的原因,否则机器无法工作。我曾与一位以色列研究生交谈,说起我一直在读维基百科上有关耶霍舒亚·巴尔-希勒尔的介绍,了解到他的孙女吉丽是《哈利·波特》系列小说的译者,在以色列小有名气。这名学生对她一无所知,谈话过程中也没表现出对出版商花钱引进魔法类儿童读物过程的兴趣。但是,如果没有吉丽·巴尔-希勒尔这样的译者一字一句精雕细琢、为一个用途非凡的平行语料库做出4000余页的语料贡献,就不会出现支持希伯来语-英语互译的谷歌翻译应用程序。在某种意义上,机器从未进行真正的翻译,而只是沿着他人铺设好的轨道飞驰,这正是机器翻译的原罪所在:若非借人类译者之功,机器翻译行业断不能有任何建树;然而它们尽管姿态极尽谦卑,却一心图谋要将人类译者取而代之。

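(The translator won first prize in the fifth “The World of English Cup” translation contest. Translator's affiliation: School of Foreign Languages, Sun Yat-sen University.)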

Is Translation an Art or a Math Problem?

By Gideon Lewis-Kraus
