Reconfiguring the Translation Ecosystem in the 2020s

01/10/2021
9 minute read

The factors driving the reconfiguration of the translation industry in the 2020s, and the emerging pricing and licensing models: Owned, Public, Private, Hosted and Shared.

In our article Translation Economics of the 2020s in Multilingual Magazine, I raised the question of how sustainable the current co-existence of near zero-cost translation and paid-per-word translation would be. Increasing volumes of content, combined with continued international business expansion, make the economic pressure too high for almost every company to ignore the tremendous potential benefits of AI-enabled translation. A reconfiguration of the translation ecosystem is inevitable and is in fact already in full swing. In this article, I will elaborate on the emerging new business models from an economic perspective.

The Three Translation Production Factors

The man-machine dichotomy dominates almost every debate about the future of the translation industry: human intelligence versus machine intelligence. Of course, this distinction is too simplistic and doesn't really help managers make informed decisions. The shift to the new age of automatic translation requires a deeper analysis of the production factors in the process. Economically speaking, we can distinguish three production factors: labor, or human capital; technology, which in our case means the models or algorithms; and raw materials or resources, which in our case is the data. How big is the contribution of each of these three production factors in the translation process, how do we utilize and value them, and how scarce or abundant are they?

New business models are being developed and offered. Big tech companies provide practically free translation that relies entirely on their technology. Innovative language service providers and start-ups utilize the technology as best they can, but keep relying on the factor they know best and can't do without: human capital. And now a new phenomenon enters the translation ecosystem: marketplaces where resources such as data can be traded and shared. The image below shows the reconfiguration of the translation ecosystem, with colors indicating the prevalence of the production factors: blue for human capital, red for technology, green for resources.

[Image: The reconfiguration of the translation ecosystem]

In the sections below we take a closer look at the traditional translation/localization model and four innovative business models. People, in general, are averse to change, but as economic history shows, technological breakthroughs combined with shortages and the cost of resources lead to fundamental changes in ecosystems. Actors in traditional translation/localization like to hold on to their way of working and to the vendors they work with. In the transition to the 'new world of AI' they will adapt and let new technologies become part of their existing workflows. That is, until a more radical break becomes inevitable and a new economic and business model needs to be implemented. Let's look at what models are available and how pricing and licensing compare across them.

Traditional Translation/Localization: the Owned Model

Traditionally, translation/localization is a human labor service, purchased as work-for-hire and priced by the word, with assumptions about how many words a translator can translate and review per hour. Technology serves as an aid to the human translators, enhancing their productivity and leading customers to negotiate price discounts on a regular basis. Resources such as translation memories are a by-product, protected by strict copyright laws. We call it the Owned model because the customer controls the process through a tightly managed supply chain with a quality control system, style sheets and glossaries. This model comes at a price of €100,000 to €150,000 per 1 million words (€0.10 to €0.15 per word), which is staggering compared to the seemingly zero-cost translation delivered by machines.

Free MT Platforms: the Public Model

The MT services offered by the ten or so largest tech companies have caused a revolution in the translation ecosystem. The majority of the world's population is familiar with Google Translate, Microsoft Bing, Yandex, Amazon, Alibaba, Apple, etc. and uses them all the time to grasp the meaning of content in foreign languages and to translate letters, web pages and anything else you can think of. The total production output of all the MT platforms together must be tens of thousands of times bigger than the capacity of all human professional translators on the planet. That is understandable, because the platforms do not deploy human labor at all, and technology, once developed, does not incur a variable cost. That is the secret behind the zero-marginal-cost model advocated by Jeremy Rifkin, and it is the ideal model for executives who don't look much further than spreadsheets. No surprise, then, that more and more companies are tapping into these MT platforms, especially since some of them have started offering a customization service at prices as low as €30 to €60 per 1 million characters. Converted to the same unit of measure, that is several hundred times cheaper than traditional translation. Yet not every company will make the jump. Their executives are rightfully concerned about translation quality, and perhaps even more about security. The awareness is growing that data is becoming perhaps the most important production factor in AI-enabled translation, and the Public MT model may not provide sufficient protection for this valuable resource.
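To make that comparison concrete, here is the back-of-the-envelope arithmetic, assuming an average of about five characters per word (the exact figure varies by language):

```python
# Back-of-the-envelope comparison: traditional translation vs. customized public MT.
# Assumption: an average word is about 5 characters; the exact figure varies by language.

CHARS_PER_WORD = 5

traditional_eur_per_m_words = (100_000, 150_000)  # Owned model, EUR per 1M words
mt_eur_per_m_chars = (30, 60)                     # customized public MT, EUR per 1M chars

# Convert the MT price to the same unit: EUR per 1 million words.
mt_eur_per_m_words = tuple(p * CHARS_PER_WORD for p in mt_eur_per_m_chars)  # (150, 300)

ratio_low = traditional_eur_per_m_words[0] / mt_eur_per_m_words[1]   # 100,000 / 300 ≈ 333
ratio_high = traditional_eur_per_m_words[1] / mt_eur_per_m_words[0]  # 150,000 / 150 = 1,000

print(f"Customized MT: EUR {mt_eur_per_m_words[0]}-{mt_eur_per_m_words[1]} per 1M words")
print(f"Traditional translation is roughly {ratio_low:.0f}x to {ratio_high:.0f}x more expensive")
```

Under these assumptions the gap works out to a factor of roughly 300 to 1,000: two to three orders of magnitude, before any discussion of quality or security.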

Build/License your Own AI/MT Platform: the Private Model

The first alternative to the Public model that large enterprises and governments may consider is the Private model: build, maintain and host your own MT infrastructure in-house, or license the technology from a dedicated MT supplier. Quality is still a challenge in this model, but data security and protection of resources are better controlled: data does not leave the house, or is hosted in a private cloud. However tempting it may be to own an MT infrastructure, even the largest organizations will think twice before embarking on such an ambitious project. Considering the heavy IT infrastructure, the investments in engineering and development, and the skills required to keep the technology optimally operational, the total cost of ownership may at the end of the day be prohibitive. Besides, the burden of managing human capital in the production process still rests on the shoulders of the enterprise. We see that only a few very large enterprises have chosen to build a private MT infrastructure; others prefer to license the technology.
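At the smallest possible scale, the Private model can be as simple as running an open-source MT model on your own hardware. Below is a minimal sketch, assuming the Hugging Face transformers library and a pretrained Helsinki-NLP model; the real costs of the Private model sit in everything around this core: GPU serving, domain adaptation, monitoring and retraining pipelines.

```python
# Minimal in-house MT: the model runs locally, so the data never leaves the house.
# Assumes: pip install transformers sentencepiece torch
from transformers import MarianMTModel, MarianTokenizer

MODEL_NAME = "Helsinki-NLP/opus-mt-en-de"  # pretrained open-source English->German model

tokenizer = MarianTokenizer.from_pretrained(MODEL_NAME)
model = MarianMTModel.from_pretrained(MODEL_NAME)

def translate(sentences):
    """Translate a batch of English sentences to German, entirely on-premise."""
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in generated]

print(translate(["The data never leaves the building."]))
```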

Human in the Loop: the Hosted Model

It is therefore far more convenient for many organizations to switch to one of the innovative Hosted-model platforms. In the traditional translation/localization model, outsourcing is essentially limited to the human service tasks (customers buy their technology from other companies). New players like Lilt, Unbabel, Translated, Blend and Lengoo, by contrast, invite customers to sign up to their platforms for a tightly integrated offering of technology, data and humans in the loop. This allows them to offer a state-of-the-art real-time adaptive MT service at prices that are 50% to 70% below traditional translation rates. Some customers may have concerns about data dependency and the risk of being locked in to a single vendor, while investors in the meantime bet that these AI-enabled translation suppliers will be among the winners. The five companies mentioned here have raised a total of $210 million in growth capital in the last few years. And they are not alone: stories about start-ups and new investments in AI-enabled translation come out on a weekly basis.
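Schematically, the adaptive, human-in-the-loop workflow behind these platforms looks something like the sketch below. The engine and post-editor here are hypothetical stand-ins, not any vendor's actual API; real adaptive MT systems update their models incrementally and batch segments rather than processing them one by one.

```python
# Illustrative human-in-the-loop (adaptive MT) workflow; all names are hypothetical.

class AdaptiveMTEngine:
    """Stub engine: proposes drafts and learns from human corrections."""
    def __init__(self):
        self.corrections = []  # accumulated (source, final) training signal

    def translate(self, source: str) -> str:
        return f"<machine draft of: {source}>"  # placeholder draft

    def learn(self, source: str, final: str) -> None:
        # A real adaptive engine would update its model incrementally here.
        self.corrections.append((source, final))

def human_in_the_loop(segments, engine, post_edit):
    """MT proposes, a human corrects, and corrections feed back into the engine."""
    deliverables = []
    for source in segments:
        draft = engine.translate(source)   # machine proposes a draft
        final = post_edit(source, draft)   # human reviews and corrects it
        if final != draft:
            engine.learn(source, final)    # correction becomes training signal
        deliverables.append(final)
    return deliverables

# Example run with a trivial "human" that always rewrites the draft.
engine = AdaptiveMTEngine()
print(human_in_the_loop(["Hello world"], engine, lambda src, draft: "Hallo wereld"))
```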

Marketplaces: the Shared Model

There is one more alternative though: marketplaces. Think of the AWS ML Marketplace, Hugging Face (the NLP community hub), a start-up like aiXplain, and of course the Systran Model Marketplace and the TAUS Data Marketplace. Marketplaces differ from the previous business models in that they are one-to-many rather than one-to-one offerings. With that, marketplaces may be the best match for the AI-driven reconfiguration of the translation ecosystem. In the traditional model every project is unique and human capital prevails, and that is still how the Human-in-the-Loop service is packaged and offered. In the new AI-enabled translation process, however, the emphasis is shifting to data and models, which can be recycled and repurposed to serve many more use cases and customers. This challenges the uniqueness and creativity so deeply rooted in the DNA of the translation profession. Tech companies understand this very well and could in theory monopolize the translation sector, except that they know they cannot (and do not want to) excel in everyone's field of expertise. That's why they are adding customization features to their MT platforms and inviting users to upload their parallel data and glossaries. We call marketplaces the Shared model because data and models are shared and licensed on a royalty basis. Users of marketplaces can take the data and models and use them to optimize their own translation workflows. Marketplaces may come at a surcharge compared to the MT platforms, but in return they give users more flexibility and independence.
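What "sharing data" means in practice: parallel data is simply aligned source/target segments, commonly exchanged in the TMX (Translation Memory eXchange) format. Here is a minimal sketch of producing such a file using only Python's standard library; the segment pairs are made-up examples.

```python
# Write aligned segment pairs to a minimal TMX (Translation Memory eXchange) file,
# the common interchange format for parallel data. Segments are made-up examples.
import xml.etree.ElementTree as ET

pairs = [
    ("Add the item to your cart.", "Voeg het artikel toe aan je winkelwagen."),
    ("Your order has shipped.", "Je bestelling is verzonden."),
]

tmx = ET.Element("tmx", version="1.4")
ET.SubElement(tmx, "header", {
    "srclang": "en", "adminlang": "en", "datatype": "plaintext",
    "segtype": "sentence", "o-tmf": "none",
    "creationtool": "example", "creationtoolversion": "0.1",
})
body = ET.SubElement(tmx, "body")
for src, tgt in pairs:
    tu = ET.SubElement(body, "tu")
    for lang, text in (("en", src), ("nl", tgt)):
        tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
        ET.SubElement(tuv, "seg").text = text

ET.ElementTree(tmx).write("corpus.tmx", encoding="utf-8", xml_declaration=True)
```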

Next step: Best Use of Human Capital

The landscape we have sketched above is a moving picture, of course. Essentially, the big tech companies allow us to imagine a future in which every company and every government can communicate fluently in all the languages spoken by their customers and citizens, wherever they are in the world. It is now down to businesses and governments to build that future. The technology is there and can easily step up to process zillions of words in both text and speech at a marginal extra cost. The friction, however, is found in the human capital and the resources (the data) vitally needed to boost the quality and reliability of the final versions. In the transition phase that we are in now, AI technology is being molded into existing processes, which in turn leads to a deglamorized and devalued role for the human translator: constantly correcting imperfect machine output is not the best use of human capital. Another deficiency in the current state of our ecosystem is the uneven spread of data and the lack of attention and value given to data work. "Everybody wants to do the model work, not the data work," as Nithya Sambasivan, the keynote speaker at the TAUS Data Summit 2021, says.

New, smarter models will therefore need to be configured. A new generation of innovators will find ways to make human capital and data more scalable in the translation ecosystem. In our next article in this series on Translation Economics of the 2020s, we will explore these topics further.

 

Author
Jaap van der Meer

Jaap van der Meer founded TAUS in 2004. He is a language industry pioneer and visionary, who started his first translation company, INK, in The Netherlands in 1980. Jaap is a regular speaker at conferences and author of many articles about technologies, translation and globalization trends.
