Text Globalization and Machine Translation (Case of English-Russian Translation of U.S. Travel Site)

The paper deals with pre-editing a source text before it goes through machine translation. The main argument is that improving the source improves the quality of the raw output. Pre-editing is investigated through the prism of the concept of a globalized text, thus contributing to research into the “translation and globalization” paradigm, which has attracted interest in recent years. The content analysis of original and Google-translated texts of the U.S. tourism portal allows us to point out some principles of composing a global English-language text that would make English-Russian translation simpler and more productive. We conclude that the content writer should take into account certain peculiarities of the translation process, which are listed in the paper.


Introduction
Technological advances in translation studies provide wider opportunities for dealing with various translation tasks. Machine translation is one such advance, and it has been constantly improved and upgraded through enhanced programs of statistical analysis and data processing. Popular services such as Babelfish and Google Translate are often used to meet the communication needs of the globalized world community, in particular on the Internet. The content of international websites, as a translation object, provides information to customers from all over the world, so the translator faces the challenge of content adaptation. This challenge lies at the core of such widely known phenomena as localization and globalization, the processes aimed at ensuring the availability and readability of information for large audiences from all over the world. While a large number of international websites have been successfully localized over the last decade, the concept of a global, or globalized, text requires much attention at the current stage of multicultural communication, which takes place in different contexts and formats. The key thesis of the present research is that an English-language text which fulfils the functions of a global text within a mono- or multilingual website space should be composed in a manner that allows for its correct machine translation. Accordingly, the purpose of the research is to point out the principles of composing a global English-language text that would serve its pragmatic functions even when foreign readers use online translation tools. The relevance of the research is determined by the need to compose such tourism texts and post them on international travel websites under the mark 'Global Site'.

Website Global Version
The tourism sector is of great importance for economic development, and the information logistics of this sector needs linguistic interpretation and guidelines. Tourism websites, travel portals in particular, are a good example of a global resource for travellers from all over the world. Along with several localized versions, websites may have a special language option marked 'Global Site' or 'International Site', aimed at audiences that are not included in the localization range for reasons such as high costs or an audience too diverse to adapt the content for each group. A large number of official online portals (for example, the travel portals of Poland, Great Britain, Norway, New Zealand, Spain, Ireland, etc.) have a Global English version, and an even larger number need one. However, the knowledge on creating a global resource is scattered and has not yet been applied in practice. The task of composing tourism texts addressed to a large audience, a "global addressee", seems rather difficult, since to date the status of the global addressee is still in question: we do not know exactly what their level of proficiency in English is or what countries they come from (countries of the inner, outer or expanding circles) [1]. Modeling a global text contributes to the development of a communicative and pragmatic paradigm in the field of translation studies. Including the globalization process in the practical tasks of a translator seems to be one of the urgent tasks in the conditions of the expanding global village. Moreover, revealing certain principles of text globalization provides tools for editing and translation with regard to a number of online and print discourses.

Machine translation of website content
Anthony Pym substantiates the idea that a global text should be suitable for machine translation, since this is a way to ensure its readability [2: 180]. The author argues that when a document has a limited number of syntactic structures and a completely controlled multilingual terminology, machine translation plus reviewing can make the translation process almost automatic. At the same time, even the most accurate machine translation technology cannot automatically account for branded content, terminology preferences, formatting considerations, subject matter, cultural issues, and other important factors [3]. The typical mistakes of machine translation made by Google Translate and PROMT were described by I.A. . They include improper use of articles, wrong translation of words and word combinations, retention of the original punctuation marks, grammatical mistakes, word-for-word translation, use of personal pronouns in impersonal constructions, etc. [4]. None of the machine translators is able to translate the whole presented text correctly, the greatest challenge being the Russian case system [5]. Google Translate, perhaps the most promoted service in Russia nowadays, is still far from perfect and gives rise to complaints from professional translators, but it helps unsophisticated users to understand the general sense of text fragments in an unknown language [5].

Literature review
The concepts of Global English and World Englishes have received much attention in Western research on incorporating varieties of English into education patterns, and ELT in particular [6]. English is a global language with a global ownership [6], and this is the very meaning of the word 'global' that is often implied in diverse studies and contexts. According to B. Seidlhofer, the concept of International English refers to the use of English on an international scale rather than to a particular language variety [7].
There is also a specific research area dealing with the concept of a 'global text', which originates from developments in the field of English language simplification. The simplified versions of English include Ogden's Basic English, Jean-Paul Nerrière's Globish, as well as Plain English and Minimal English. Texts written in any of these versions are meant to simplify the decoding of information by non-native speakers of English. Basic English is a simplified subset of regular English created by the linguist and philosopher Charles Kay Ogden as an international auxiliary language and as an aid for teaching English as a second language [8]. Jean-Paul Nerrière's Globish is a trademarked name for a subset of standard English grammar and a list of 1500 English words. Nerrière claims it is "not a language", but rather the common ground that non-native English speakers adopt in the context of international business [9]. The Plain English Campaign (PEC), established in 1979 and based on the principle that everyone should have access to clear and concise information, has helped many government departments and other official organisations with their documents, reports and publications [10]. Today the campaign has gained wide popularity through a dedicated web platform offering a large number of commercial services for editing English texts. The concept of Minimal English was introduced by Anna Wierzbicka in 2014 to denote a radically reduced 'Mini-English': an interlanguage, a global means for discussions and information exchange [11: 194]. Cliff Goddard points to cross-translatability as the key distinction between the 250 words of Minimal English and all other simplified English models [12].
It should be noted that these conceptions consider the global language and the global text within the scale of one language, English. They have received further interpretation and development in both foreign and Russian studies. John Kohl gives guidelines on how to write documentation that is optimized for non-native speakers of English, translators, and even machine-translation software [13]. Russian researchers analyze Internet communication [14][15][16], proceeding from the need to make an international website play the role of a multicultural platform by means of global text options.

Materials and methods
The two-facet nature of the globalization process, which covers both editing an English text and translating from English, is of special importance, particularly for the problem of developing a multilingual tourism website. The efficiency of machine translation technologies depends on obligatory presettings and initial text quality [5], and for many language pairs we are now at the stage where it is quicker to "post-edit" machine-translation output than to start translating from scratch [17; 18]. As shown in [19], global versions of tourism websites rarely conform to the existing techniques of English text adaptation, although these techniques have proved effective in translating texts of online tourism discourse, which is widely represented on numerous web portals of cities and countries.
The initial stages of ongoing research on the attitude of Russians to English-language online content reveal a trend towards using Google Translate instead of searching further for the necessary information in Russian.
In the present research we analyze the U.S. tourism portal VisitTheUSA.com (https://www.visittheusa.com/), the official travel site of the USA, which has several language options: Global English, English (U.K.), English (Australia), English (Canada), English (India), French (Canada), Portuguese (Brazil), Sweden, French (France), German (Germany), Korean (Korea), Spanish (Chile), Spanish (Colombia), Spanish (Mexico), Japanese (Japan), Simplified Chinese (China). The content analysis shows that the Global English version is the primary one in the site structure and is supposed to be 'global', that is, simple, clear and easy for non-native speakers to translate. We will focus rather on "pre-editing" for machine translation and analyze the translation mistakes that can be prevented by creating texts in Global English specially designed for further translation into Russian via Google Translate.

Results
Translation of texts from the section Destinations. Alaska (https://www.visittheusa.com/state/alaska) reveals some common Google mistakes. Table 1 presents fragments of the original, adapted and translated versions of the text The Alaskan wilderness: America's final frontier. According to the Collins dictionary [20], the word 'nowhere' has 9 shades of meaning in different contexts, 'in, at, or to no place; not anywhere' being the primary one. This means that we should edit the phrase by replacing the negative grammatical form 'like nowhere else' with a positive one. The proposed adjectives (unique / exceptional / one-of-a-kind) preserve the original proposition of exceptional nature. As we can see from Table 1, testing the adapted options in Google Translate confirms that the sense is decoded correctly. The illustrated negative-for-positive substitution coincides with one of the Plain English principles, 'Write in the positive', which implies that positive sentences are shorter and easier to understand than their negative counterparts (A Plain English Handbook).
Table 2 presents fragments of the original, adapted and translated versions of the text Watch the Wildlife. Conversion, the derivation of a new word without any overt marking, in particular noun to verb and verb to noun, is one of the many ways to create new English words on the basis of existing ones [21: 134]. The above-mentioned fragment demonstrates a mistake in identifying the part of speech during machine translation: the original 'spawn' is used as a verb, while the translated lexeme functions as a noun. This implies that the global text should be structured so as to exclude such variance. For example, the pronoun 'which' may be used before an ambiguous word (see adapted fragment 1). We may also point to a deviation in grammatical structure caused by a punctuation mistake: the absence of a comma after 'traffic jams' in the original phrase. Inserting a comma between the two clauses recovers the actual sense (see adapted fragment 2).
The analysis of the given fragment allows us to point out the following words and word combinations which should be corrected. According to Wayne Magnuson's Dictionary of English Idioms, Sayings and Slang [22], 'take a drive' is an idiom meaning 'go for a drive, go on a trip', and the best option, in our view, is to use the heading 'Go on a breathtaking trip', which Google translates as a tempting offer: 'Otpravtes v zakhvatyvayushchuyu poezdku'. The abbreviation RV needs to be expanded in the English text ('recreational vehicle'), since the translation of such lexemes is one of the most urgent issues of machine translation, particularly in the case of specialized abbreviations that are not in frequent use. The translation of the verbs 'fly' and 'rent' with the nouns 'vylet' and 'arenda' illustrates the above-mentioned problem of determining the part of speech by the translation program. In this case the problem may be solved by using a pronoun and a modal verb (You can fly and rent).
As we can see from Table 3, the phrase National "Scenic Byways" is not translated by Google because of the quotation marks and capital letters. Removing these markers allows for the correct translation (see adapted fragment). Table 4 presents fragments of the original, adapted and translated versions of the text Visit the Country's Largest National Park.

Adapted fragment: 'Alaska includes 17 national park areas'. Google translation: 'Alyaska vklyuchaet v sebya 17 natsionalnykh parkov'.
Using the word 'home' in the given context certainly intensifies the journalistic style of the original text, but this intensifier is not deciphered by Google Translate. In our opinion, a globalized text should mostly consist of lexical units that exclude, as far as possible, the problem of sense ambiguity in machine translation. Talking about the location of parks in Alaska, we may consider such verbs as locate, have, include, etc. (see adapted fragments 1, 2, 3).
Table 5 presents fragments of the original, adapted and translated versions of the text Cruise to a Glacier. A special role in translating tourism texts is played by the linguistic and cultural adaptation of microtoponyms, that is, proper names of objects in the human-created environment, such as squares, parks, museums, theaters and other cultural facilities, which create the attractive image of a city or a certain tourist destination [23]. As we can see from Table 4, Google Translate misrepresented both the meaning and the original form of the three given microtoponyms. We have already shown that Google does not translate words in quotation marks (see Table 3); therefore, the best option in this case would be to place the microtoponyms in quotation marks, thus preserving their original English spelling (see the adapted text fragment in Table 5). These words will surely not be understood by Russian readers, but they will make it easy to search for more detailed information about the destinations. Moreover, fragments of local languages bear the concept of personal enrichment [24] and function as original markers of 'otherness' [25: 25].
Table 6 presents fragments of the original, adapted and translated versions of the text "Explore the Wilderness". The given fragment illustrates the need to use more common words and word combinations (remote instead of backcountry, house instead of lodge, fishing instead of fly-fishing) and to test gerund forms, since in some cases Google Translate is not able to preserve their grammatical meaning.
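The two opposite uses of quotation marks discussed above (removing them where a translation is wanted, adding them where the English spelling of a name should survive) can be illustrated with a minimal Python sketch. It only prepares the English text and does not call any translation service. The function names, the example sentences and the place name 'Example Glacier' are hypothetical, and lower-casing the exposed phrase is our reading of "elimination of capital letters", not a procedure stated in the study.

```python
import re


def drop_markers(text, phrases):
    """Remove double quotes around each listed phrase and lower-case it,
    so that the phrase is more likely to be translated than copied."""
    for phrase in phrases:
        pattern = re.compile('"' + re.escape(phrase) + '"')
        text = pattern.sub(lambda m, p=phrase: p.lower(), text)
    return text


def add_markers(text, names):
    """Wrap each listed proper name in double quotes, unless it is
    already quoted, so that its English spelling is kept."""
    for name in names:
        pattern = re.compile('(?<!")' + re.escape(name) + '(?!")')
        text = pattern.sub(lambda m: '"' + m.group(0) + '"', text)
    return text


# Hypothetical sentences, used only to show the effect of each helper.
print(drop_markers('Drive along the "Scenic Byways" of Alaska.', ["Scenic Byways"]))
print(add_markers("Cruise to Example Glacier.", ["Example Glacier"]))
```

The sketch assumes that the listed names do not overlap; whether the marked text is then handled as described still has to be checked by testing it in the translation service, as was done for the adapted fragments.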

Conclusion
In general, we may note a relatively high level of Google translatability of the analyzed texts: only 20% of the whole text corpus needs post-editing or, in our case, pre-editing. The content analysis of original and Google-translated texts of the U.S. tourism portal allows us to point out some principles of composing a global English-language text that would make English-Russian translation simpler and more productive. If an English-language website has no localized Russian version, but Russian readers may be interested in the content, the content writer should take into account the following peculiarities of the translation process:
- negative-for-positive substitution makes sentences clearer and easier for Google Translate to process;
- conversion as a word-formation model should be taken into account to avoid variance in part-of-speech decoding by the machine translator (for example, to highlight the verbal grammatical meaning, the pronoun 'which' or modal verbs may be used: 'spawn' becomes 'which spawn'; 'fly' becomes 'can fly');
- it is necessary to check the punctuation of the original text to avoid mistakes;
- Google Translate preserves the original spelling of words in quotation marks and of capitalized words; therefore, when we need an equivalent translation, we should exclude these markers;
- in some cases, preserving the original spelling of words by the above-mentioned means may be productive, microtoponyms being a good example since they are hard to decode by machine;
- abbreviations should be explained;
- words used figuratively create ambiguity of meaning in the translated text.
The analyzed process of editing English texts to make them appropriate for machine translation is creative in character and may yield a number of possible variants, each ensuring correct further translation.
The proposed technique of adjusting original texts to Google Translate and other translation services may also be used with regard to other languages.
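As an illustration only, some of the pre-editing principles listed in the conclusion can be turned into a simple checker that flags places a content writer may want to review before publishing a global text. The following Python sketch is not the procedure used in the study: the rule names, the regular expressions and the example sentence are our assumptions, and the rules are deliberately crude (for instance, every run of two or more capital letters is treated as an abbreviation that needs explaining).

```python
import re
from dataclasses import dataclass


@dataclass
class Flag:
    rule: str   # name of the pre-editing principle that was triggered
    text: str   # the matched fragment of the sentence


# Hypothetical, simplified patterns for three of the principles.
RULES = {
    "negative form": re.compile(r"\b(?:nowhere|never|not|no)\b|n't\b", re.IGNORECASE),
    "quoted span": re.compile(r'"[^"]+"'),
    "abbreviation": re.compile(r"\b[A-Z]{2,}\b"),
}


def pre_edit_flags(sentence):
    """Return a list of fragments that may need pre-editing."""
    flags = []
    for rule, pattern in RULES.items():
        for match in pattern.finditer(sentence):
            flags.append(Flag(rule, match.group(0)))
    return flags


# Hypothetical sentence combining two issues discussed in the Results.
for flag in pre_edit_flags("Rent an RV and see nature like nowhere else."):
    print(flag.rule, "->", flag.text)
```

Such a checker only points at candidate fragments; deciding how to rewrite them, and testing the rewritten text in the translation service, remains the editor's task.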
The reported study has been funded by the Russian Foundation for Basic Research (RFBR) and the Government of the Volgograd region. Research project No 17-14-34001 "Regional tourism as a factor of discourse and translation technology formation: nominative and communicative-pragmatic conventions of text as a branding tool" (Regional contest "The Volga Lands in the Culture and History of Russia").