Finetune a model in your local dialect

Text-to-speech and transcription models are getting very good, but they still have some way to go with local accents and dialects. While there are some private efforts towards this goal, open-source models consistently push the frontier forward and make the benefits of AI more accessible to all.

The Task

Pick a local dialect and benchmark the top open-source text-to-speech or transcription model on that dialect. This first involves creating the benchmark. For transcription this is fairly straightforward: create a set of recordings with correct transcriptions, then assess how accurately the model transcribes them. For text-to-speech, benchmarking is human-assessed and more subjective, so look up existing approaches!
What counts as a local dialect is open to your interpretation. It can be something like “Madras Tamil” or “Champaran-style Hindi”, or even something hyper-specific (“Telugu as spoken in my native village”). It can also be a language that currently performs poorly on benchmarks. Whatever you pick, make sure you have a reasonable way to collect samples to train your model.
Publish your model on Hugging Face and open-source your training code on GitHub.
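
For the transcription benchmark, one common way to "assess correctness" is word error rate (WER): the number of word-level insertions, deletions and substitutions needed to turn the model's output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration, not a prescribed method. The function names, the whitespace tokenization and the NFC normalization step are assumptions made for this example; your dialect may need different text normalization (punctuation, numerals, spelling variants).

```python
import unicodedata


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (insert/delete/substitute)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def tokenize(text):
    """Assumed normalization: Unicode NFC, then split on whitespace."""
    return unicodedata.normalize("NFC", text).split()


def corpus_wer(pairs):
    """WER over (reference, hypothesis) transcript pairs.

    Errors and reference lengths are summed across the whole benchmark,
    so long recordings weigh more than short ones.
    """
    total_errors = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = tokenize(reference)
        total_errors += word_edit_distance(ref_words, tokenize(hypothesis))
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("benchmark needs at least one reference word")
    return total_errors / total_ref_words
```

Running `corpus_wer` on the same recordings before and after training gives a single number to compare. For scripts where word boundaries are ambiguous, a character-level variant of the same calculation may be more informative.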

Bonus Points

Make a quick demo application to show off your work! This can be as simple as a Gradio application on Hugging Face.
Think deeply about benchmarking and what nuances it needs to capture for your particular use case and publish these thoughts as a blog post!
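
For the human-assessed text-to-speech side, one existing approach worth looking up is the mean opinion score (MOS): listeners rate each generated clip on a 1 to 5 scale and the ratings are averaged. The sketch below shows how such ratings might be summarized. The function name is hypothetical, and the 95% interval uses a normal approximation, which is an assumption that becomes shaky with only a handful of listeners.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mean, half_width) for a list of 1-5 listener ratings.

    half_width is the half-width of an approximate 95% confidence
    interval, assuming the sample mean is roughly normally distributed.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mean = statistics.fmean(ratings)
    std_error = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, 1.96 * std_error
```

Reporting the interval alongside the mean makes it easier to tell whether a difference between two models is larger than the disagreement between listeners.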

Why it Matters:

Lots of Indians access the internet through speech (voice notes, videos, etc.), and these models are how speech will be generated and interpreted going forward. The future of apps will likely be much more voice-based, so the capacity of many Indians to access and enjoy this progress depends on these pieces (TTS and transcription) working well. Many models are now also being trained to be natively multi-modal, and this work can be valuable training data for such efforts.

What's in it for you:

While big labs are making progress, your specific dialect might not be their top priority. But it can be yours! By making your work open source, you let labs and companies use it to serve people better in the dialect of your choosing. That’s an empowering way to make sure that people in your community, from backgrounds like yours, are included!

We will publicize and distribute excellent work through our channels so it gets noticed. We'll also get you final-round interviews at top AI companies, if that's something you want.