Helsinki NLP
helsinki-nlp.bsky.social
did:plc:jjct7xojij4irkp7a36w5iz7
📢 [**mala-opus-dedup-2410**](https://huggingface.co/datasets/MaLA-LM/mala-opus-dedup-2410) 🎉
Part of the [**MaLA Corpus**](https://huggingface.co/collections/MaLA-LM/mala-corpus-66e05127641a51de34d39529), this dataset is built from [OPUS](https://opus.nlpl.eu) (cutoff Oct 2024) and features **16,829 language pairs**, with deduplication, normalization, and noise filtering applied.
2025-05-18T11:00:32.848Z