Neural machine translation based word transduction mechanisms for Low Resource Languages

This work aims to alleviate the loss of translation quality caused by the frequent occurrence of Out-of-Vocabulary (OOV) words during machine translation of low-resource languages (LRLs). We propose a novel word-to-character embedding mapping algorithm and apply it to three variants of attention-based seq2seq models to transduce such words from Hindi to Bhojpuri…
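
To make the OOV setting concrete, here is a minimal sketch of the preprocessing step that a character-level transducer would need: words found in a known vocabulary are kept whole, while OOV words are broken into character sequences. This is not the embedding mapping algorithm from the work itself, which the summary above does not detail. The function name `segment_for_transduction`, the toy vocabulary, and the romanized example words are all hypothetical.

```python
def segment_for_transduction(sentence, vocab):
    """Return (word, is_oov, units) triples for a whitespace-tokenized sentence.

    In-vocabulary words stay whole; OOV words are split into characters,
    the form a character-level seq2seq transducer would take as input.
    """
    segmented = []
    for word in sentence.split():
        if word in vocab:
            segmented.append((word, False, [word]))
        else:
            # list() splits a string into Unicode code points.
            segmented.append((word, True, list(word)))
    return segmented


# Hypothetical toy vocabulary and romanized sentence, for illustration only.
vocab = {"main", "ghar"}
result = segment_for_transduction("main ghar jaunga", vocab)

for word, is_oov, units in result:
    print(word, "OOV" if is_oov else "known", units)
```

Note that for Devanagari input, `list()` yields code points, so vowel signs (matras) come out as separate units; a real pipeline might prefer to segment by grapheme cluster instead.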

MTDMA: Multi-task Deep Morphological Analyzer

The project aims to predict the part of speech (POS), gender (G), number (N), person (P), case (C), and tense-aspect-mood (TAM) marker, as well as the lemma (L), or root, of words occurring in Hindustani texts (viz. Hindi and Urdu), by sharing the knowledge learned while capturing the representation of each of these in a multi-task learning…
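
The idea of sharing one learned representation across several morphological tags can be sketched as a shared encoder feeding one small classifier head per task. The following is an illustrative, untrained forward pass in plain Python, not the MTDMA architecture. The label inventories, layer sizes, character-hashing features, and the romanized example word are all assumptions made for the sketch, and only three of the tasks are shown.

```python
import math
import random

# Hypothetical label inventories, for illustration only.
TASKS = {
    "POS": ["noun", "verb", "adjective", "other"],
    "G": ["masculine", "feminine", "any"],
    "N": ["singular", "plural", "any"],
}
HIDDEN = 8    # size of the shared representation (assumed)
BUCKETS = 32  # size of the hashed character feature vector (assumed)

rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# One encoder shared by every task, plus one output head per task.
shared_weights = rand_matrix(HIDDEN, BUCKETS)
head_weights = {task: rand_matrix(len(labels), HIDDEN) for task, labels in TASKS.items()}


def char_features(word):
    """Bag-of-characters counts, bucketed by code point modulo BUCKETS."""
    feats = [0.0] * BUCKETS
    for ch in word:
        feats[ord(ch) % BUCKETS] += 1.0
    return feats


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_encode(word):
    """Shared representation that all task heads read from."""
    return [math.tanh(h) for h in matvec(shared_weights, char_features(word))]


def predict(word):
    """Return one probability distribution per task for a single word."""
    hidden = shared_encode(word)
    return {task: softmax(matvec(head_weights[task], hidden)) for task in TASKS}


predictions = predict("ladka")  # romanized example word (hypothetical)
for task, probs in predictions.items():
    best = max(range(len(probs)), key=probs.__getitem__)
    print(task, TASKS[task][best])
```

Because the weights are random, the printed labels are meaningless. The point is only the structure: every head consumes the same `shared_encode` output, so in a jointly trained setup (typically by summing the per-task losses) the gradient from each task would shape the representation used by the others.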