- Creating a computationally usable speech-to-text system for Kyrgyz.
- Using bootstrapping methods with overlapping known systems to develop speech-to-text programs for under-resourced languages.
Developing any kind of human language technology (HLT) typically requires a large amount of data from spoken and/or written corpora. However, for most of the world’s languages these resources are not currently available.
Creating these resources is a time-consuming and expensive endeavor, requiring research assistants who are natively fluent in the target language and also have some technical training.
In particular, the resources required to collect, transcribe, and align a speech corpus are a major hindrance to developing useful HLT such as automatic speech recognition (ASR).
Despite the challenges, there is a real need to develop such technologies for the world’s languages. From a relief worker’s point of view, being able to communicate with speakers of another language is a major asset in times of natural disaster.
From a consumer’s perspective, applications such as Siri and Google Translate make products more attractive. There is a demand for HLT for under-resourced languages, and research in this field seeks techniques that make training and development faster, better, and more efficient.
My research uses minimal data from the Kyrgyz language to develop a functional ASR system, investigating the effects of incorporating linguistic knowledge into the training process. Kyrgyz is a Turkic language spoken by approximately 4.5 million people in Kyrgyzstan, Kazakhstan, Uzbekistan, Tajikistan, Afghanistan, Russia, and China. Google Translate began supporting Kyrgyz on February 17, 2016, but there are currently no speech-to-text technologies for the language.