Data Cleaning 101

23 February 2021
5:00 - 6:00 pm CEST

Every company ‘sits’ on a mountain of language data in translation memories and content management systems. But that data is locked up in legacy formats and templates that make it hard to access and use in modern machine translation scenarios...

Agenda

  • Problems in data (why cleaning is required)

  • The available tools and their limitations

  • Cleaning based on sentence embeddings (LASER, LaBSE)

  • Comparison with examples