description: Cross Lingual Dictionary Generation Tool Using Phonetic Similarity for Natural Language Translation of Gujarati to Hindi.
python, NLP
19ce098 : Heer Patel 19ce119 : Hemit Rana 19ce121 : Shruti Rana 20ce114 : Yagnik Poshiya 20ce132 : Nisarg Shah 20ce155 : Adnan Vahora
Dr. Ashwin Makwana
Natural Language Processing, Machine Translation, Linguistic Resources, Corpus, Phonology, and Hamming Distance.
Machine translation is an important technology, and is particularly relevant in a linguistically diverse country like India. It requires a deep and rich understanding of the source language and the input text, and a sophisticated, poetic and creative command of the target language. The largest market is for translation from English to Indian languages, primarily Hindi. It is therefore no surprise that the majority of Indian Machine Translation (MT) systems are for English-Hindi translation, but there is also demand for translation from one Indian language to another. In particular, there is great demand for Hindi to Gujarati translation.
As is well known, natural language processing presents many challenges, the biggest of which is the inherent ambiguity of natural language. Any machine translation system has to deal with ambiguity and with various other natural language phenomena. In addition, the linguistic differences between the source and target languages make MT a bigger challenge. Producing a high-quality translation of arbitrary text from one language to another is therefore far too hard to automate completely. Certain simpler translation tasks, however, can be addressed with current computational models. We therefore restrict ourselves to a limited domain, that is, one with a limited vocabulary and only a few basic phrase types.
The technology behind a machine translation system is not simple. Merely replacing source-language words with target-language words does not yield a good system, because a word-for-word translation rarely produces satisfying target-language text. A good machine translation system must incorporate good knowledge not only of the vocabulary of both the source and target languages, but also of their grammar.
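The limitation of plain word replacement can be seen in a minimal sketch. The lexicon below is a hypothetical toy example written for illustration, not data from this project, and `word_for_word` is an assumed helper name. It looks up each whitespace-separated token and does nothing else, so it cannot handle agreement or any other grammatical choice.

```python
# Hypothetical toy lexicon (Gujarati -> Hindi); entries are illustrative only.
TOY_LEXICON = {
    "હું": "मैं",
    "પાણી": "पानी",
    # The Hindi verb form is marked for gender, the Gujarati form is not,
    # so a fixed dictionary entry has to guess (masculine chosen here).
    "પીઉં": "पीता",
    "છું": "हूँ",
}


def word_for_word(sentence: str, lexicon: dict) -> str:
    """Replace each whitespace-separated token with its lexicon entry.

    Tokens missing from the lexicon are passed through unchanged.
    """
    return " ".join(lexicon.get(token, token) for token in sentence.split())


hindi = word_for_word("હું પાણી પીઉં છું", TOY_LEXICON)
print(hindi)
```

Because the lookup is purely lexical, the output always uses the single verb form stored in the dictionary, whoever the speaker is. Choosing the right form requires grammatical knowledge of the target language, which is the point made above.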