TAUS Releases V2 of Estimate API, Featuring Considerable Improvements for 5 EU Languages

29/07/2024

Discover the enhanced TAUS Estimate API V2, featuring improved accuracy for 5 EU languages. Save up to 60% on post-editing efforts and reduce translation risks with our extensively trained generic model. Learn more and start your free trial today!

 

The TAUS Estimate API was first released in October 2022. Since then, users have reported savings of between 25% and 60% on human post-editing effort and costs, and the API has helped high-volume users in MT-only settings mitigate the risk of bad translation output. A generic model is available off the shelf, and the API makes it easy to integrate into existing platforms and into content and translation workflows. For V2 of the TAUS Estimate API, this generic model has undergone comprehensive training.

The TAUS Data repository (consisting of 7+ billion words in about 600 language pairs and domains) has been instrumental in training the TAUS Estimate API models. To improve the generic model, the NLP team at TAUS pulled millions of sentences from the repository in the IT, Healthcare, Commerce, Legal and Business domains, for English into French, Italian, German and Spanish, and curated them into high-quality training sets. After extensive training and analysis, the team is happy to report considerable improvements for these languages.

So what improvements can you expect when you start using V2 of the TAUS Estimate API? Below are some of the important updates:

  • V2 is highly sensitive to accuracy in translation, including proper names, numbers, and specific details.
  • It significantly penalizes additions, omissions, and word swaps in translations whenever they change the meaning or make sentences syntactically incorrect, while still accounting for natural differences in word order between languages.
  • It expects similar punctuation between source and target, with some flexibility for language-specific rules.
  • It can detect and penalize overly formal or informal styles compared to the source text.
  • V2 does a better job at estimating the quality of short sentences.
  • It provides a more spread-out range of scores, making it easier to distinguish between different levels of translation quality.
  • While it performs best on the languages it was trained on, it can also recognize similar error types in untrained languages, with some limitations.

Examples

V2 notices mistranslations in less obvious cases. Here is a case of polysemy in English, where Spanish uses different words depending on context: "table" is "tabla" for a data table but "mesa" for a piece of furniture.

| Source | Target | Score |
| --- | --- | --- |
| To add a cell to a table row, you use the <td> tag. | Para agregar una celda a una fila de la tabla, usas la etiqueta <td>. | 0.92 |
| To add a cell to a table row, you use the <td> tag. | Para agregar una celda a una fila de la mesa, usas la etiqueta <td>. | 0.67 |

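Quality estimation models often fail to react to incorrect gradations, such as a wrong size or degree. V2 picks up these relatively subtle errors (subtle in relation to sentence length and word type). In the example below, "medium-sized" is mistranslated as "grandes" (large) in the first target.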

| Source | Target | Score |
| --- | --- | --- |
| The Thundercloud Solution for E-Mail includes every option necessary for e-mail storage management designed for medium-sized businesses. | Thundercloud Solution for E-Mail comprend toutes les options nécessaires à la gestion du stockage des e-mails et est conçu pour les grandes entreprises. | 0.82 |
| The Thundercloud Solution for E-Mail includes every option necessary for e-mail storage management designed for medium-sized businesses. | Thundercloud Solution for E-Mail comprend toutes les options nécessaires à la gestion du stockage des e-mails et est conçu pour les moyennes entreprises. | 0.91 |

Close-enough translations often go unnoticed by many types of quality estimation, which are also generally known to struggle with short sentences. V2 makes a clear distinction between "software" and "operating system" (Betriebssystem):

| Source | Target | Score |
| --- | --- | --- |
| hardware and software requirements | Hardware- und Softwareanforderungen | 0.94 |
| hardware and software requirements | Hardware- und Betriebssystemanforderungen | 0.85 |

Suggested Thresholds

Quality standards and expectations are of course subjective, so it is up to you, based on your use case, to decide where to draw the line between good and bad quality. However, here are some guidelines from the NLP team for interpreting the scores and making decisions when using V2 of the generic model:

  • Scores above 0.9 generally indicate good translations
  • 0.88-0.9 is a gray area (can be good, might have issues)
  • Below 0.88 usually indicates at least minor errors
  • Below 0.8 suggests serious errors
  • Below 0.7 indicates very poor quality
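The guidelines above can be sketched as a small helper that maps a score to a band and reports what share of segments clears a chosen threshold. This is only an illustrative sketch: the function names, the band labels, and the assumption that scores fall between 0 and 1 are not part of the TAUS Estimate API, and the boundary handling (exactly 0.9 counts as gray area) is one reading of the list above.

```python
from typing import Iterable


# Bands follow the suggested thresholds listed above. The function names,
# band labels, and the 0-to-1 score range are assumptions for illustration.
def interpret_score(score: float) -> str:
    """Map a V2 generic-model score to one of the suggested quality bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score outside the assumed range [0, 1]: {score}")
    if score > 0.9:
        return "good"
    if score >= 0.88:
        return "gray area"
    if score >= 0.8:
        return "minor errors likely"
    if score >= 0.7:
        return "serious errors likely"
    return "very poor"


def share_above(scores: Iterable[float], threshold: float = 0.9) -> float:
    """Return the fraction of segments scoring strictly above the threshold."""
    values = list(scores)
    if not values:
        return 0.0
    return sum(s > threshold for s in values) / len(values)


if __name__ == "__main__":
    # Scores taken from the example tables in this article.
    example_scores = [0.92, 0.67, 0.82, 0.91, 0.94, 0.85]
    for s in example_scores:
        print(f"{s:.2f} -> {interpret_score(s)}")
    print(f"share above 0.9: {share_above(example_scores):.2f}")
```

The threshold is kept as a parameter because the right cut-off depends on your own quality expectations and use case.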

Version 2 of the TAUS Estimate API was released on 8 July. To switch to V2, please follow the instructions here.

Interested in trying out the improved generic model? Sign up for a free trial and get access to 500,000 characters in Sandbox mode.

 

Author

Anne-Maj van der Meer is a marketing professional with over 10 years of experience in event organization and management. She has a BA in English Language and Culture from the University of Amsterdam and a specialization in Creative Writing from Harvard University. Before her position at TAUS, she was a teacher at primary schools in regular as well as special needs education. Anne-Maj started her career at TAUS in 2009 as the first TAUS employee, where she became a jack of all trades, taking care of bookkeeping and accounting as well as creating and managing the website and customer services. For the past 5 years, she has worked as Events Director, chief content editor and designer of publications. Anne-Maj has helped organize more than 35 LocWorld conferences, where she takes care of the program for the TAUS track and hosts and moderates these sessions.
