Data for AI
Train your LLMs and MT engines with high-quality domain-specific multilingual corpora, carefully curated by TAUS data experts. Explore our offering below or get in touch to obtain the data you need.
Grow your model with TAUS quality data
TAUS offers a core collection of high-quality multilingual training data, 7.4 billion words across 483 language pairs, at very attractive prices to developers of AI models, LLMs, and MT engines.
Download Data Catalog & Pricing