Leveraging Loanword Constraints for Improving Machine Translation in Low-resource Settings
Dec 2, 2025•Channel
Video Overview
Video Details
Published: 6 months ago
Duration: 45:03
Video ID: pqClPCtIvQ0
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video
Performance Metrics
Views: 177
Likes: 8
Comments: 0
Engagement Rate: 4.52%
Likes per 100 views: 4.52
Comments per 1K views: 0.00
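The derived figures above can be reproduced from the raw counts. The page does not state how it computes the engagement rate, so the sketch below assumes it is (likes + comments) divided by views, which is consistent with the numbers shown. The variable names are illustrative only.

```python
# Raw counts as listed under Performance Metrics.
views, likes, comments = 177, 8, 0

# Per-view ratios, rounded to two decimals as on the page.
likes_per_100_views = round(likes / views * 100, 2)
comments_per_1k_views = round(comments / views * 1000, 2)

# Assumed definition (not stated by the source): (likes + comments) / views, in percent.
engagement_rate_pct = round((likes + comments) / views * 100, 2)

print(likes_per_100_views, comments_per_1k_views, engagement_rate_pct)
```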
Description
Translating from high-resource languages into low-resource languages such as Emakhuwa remains a challenge because of limited parallel data, orthographic variation, and frequent loanwords and code-switching. In this talk, Felermino will discuss how lexicon-guided neural machine translation, which integrates bilingual dictionaries and loanword mappings into the training process, can address this challenge.
Our method uses over 8,000 dictionary entries and 12,000 loanword mappings to build sentence-specific glossaries, which are incorporated via input augmentation. Experiments on FLORES+ show improved lexical coverage, reduced inconsistencies, and more contextually accurate translations, suggesting a promising direction for low-resource MT: bridging data scarcity and vocabulary gaps through structured lexical integration.
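As a rough illustration of what a sentence-specific glossary and input augmentation could look like, here is a minimal sketch. It is not the speaker's implementation: the lexicon contents, the `<sep>` marker, the `src = tgt` hint format, and the function names `build_glossary` and `augment_input` are all assumptions made for this example, and the target strings are placeholders rather than real Emakhuwa words.

```python
import re

# Hypothetical toy lexicons. Target strings are placeholders, NOT real Emakhuwa.
DICTIONARY = {"school": "TGT_school", "water": "TGT_water"}
LOANWORDS = {"hospital": "TGT_hospital"}


def build_glossary(sentence, *lexicons):
    """Collect (source, target) pairs for lexicon entries that occur in the sentence.

    Lexicons are consulted in the order given, so earlier ones take priority.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    glossary, seen = [], set()
    for token in tokens:
        if token in seen:
            continue
        for lexicon in lexicons:
            if token in lexicon:
                glossary.append((token, lexicon[token]))
                seen.add(token)
                break
    return glossary


def augment_input(sentence, glossary, sep=" <sep> "):
    """Append glossary hints to the source sentence (assumed augmentation format)."""
    if not glossary:
        return sentence
    hints = " ; ".join(f"{src} = {tgt}" for src, tgt in glossary)
    return f"{sentence}{sep}{hints}"


sentence = "The school is near the hospital."
glossary = build_glossary(sentence, LOANWORDS, DICTIONARY)
augmented = augment_input(sentence, glossary)
print(augmented)
```

Under these assumptions, the translation model would be trained and queried on the augmented string, so the glossary hints are available to it alongside the original sentence. Sentences with no lexicon matches pass through unchanged.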
Learn more about Microsoft Research Lab – Africa, Nairobi: https://www.microsoft.com/en-us/research/lab/microsoft-research-lab-africa-nairobi/seminars/