Around the time of the Battle of Plassey in 1757, the form of Bangla we recognise today began to take shape, drawing largely from the dialect spoken in the Nadia region. Over time, this dialect gained prominence and gradually formed the basis of modern standard Bangla.
As the use of AI increases, will Bangla gradually lose some of its multidimensional expressive capacity?
One major obstacle for Bangla in this age of AI is the absence of a strong, comprehensive corpus. While English utilises systematic processes to track word frequency and evolution, Bangla lacks an official, systematically updated record.
Historically, people were the primary drivers of the evolution of Bangla. Writers standardised forms and communities collectively decided what sounded “natural.”
But in the 21st century, technology has also been a powerful co-author. Social media shortens expressions. Autocorrect nudges spelling choices. Search engines influence which words people use to find information.
In the rapidly evolving field of artificial intelligence, language representation is a significant challenge. For major global languages, there is an abundance of datasets and resources that enable the development of advanced AI models. However, languages like Bengali, spoken by over 230 million people worldwide, remain underrepresented in AI research.
Now, artificial intelligence (AI) has added a new layer. With the rise of generative AI and large language models (LLMs) such as ChatGPT or Gemini, we can now hold conversations with machines in Bangla. AI has begun to read, write, and even craft creative content in Bangla. We can turn to it with our questions in Bangla and receive instant replies.
For the first time in history, language evolution is partly being steered by machines trained on digital data. As this new voice enters our linguistic world, what will it mean for the language itself—for how we speak, shape, and pass it on? And what of young people, still learning Bangla, yet already turning to LLMs to converse in it?
“LLMs like ChatGPT or Gemini operate within a limited set of response patterns, typically around 10 to 20. When someone consults an LLM, or a student seeks educational help from it, their vocabulary and ways of expressing ideas can become confined to these fixed patterns. It can quietly guide how people write. Over time, these micro-influences accumulate,” said Nishat Raihan, an LLM researcher at George Mason University.
“Previously, if 20 different people tackled the same question, they would each bring unique expressions and approaches. Today, that richness of language and diversity of thought is at risk of narrowing, particularly among young people who rely on it heavily for assignments and homework.” (Daily Star)
Many linguists share similar concerns, viewing these technological shifts through the lens of language use and creativity.
“AI’s language is mechanical, and it is often quite noticeable whether an assignment or a piece of creative writing has been produced with AI. Bangla is a powerful language that can express a single idea in many different ways. As the use of AI increases, Bangla may gradually lose some of this multidimensional expressive capacity,” noted Dr. Tariq Manzoor, Professor in the Department of Bangla, Dhaka University.
“In the Oxford or Cambridge dictionaries, you’ll see that they systematically record new words as they are added to the language. In English, they maintain a corpus of the language, which allows them to track even small changes.
“By studying word frequencies, we can observe which words are increasing in use, which are declining, and which are disappearing from the language altogether. This is an efficient process to keep track of language changes,” explained Professor Musa.
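To make the idea of frequency tracking concrete, here is a minimal sketch in Python, not a description of how any dictionary publisher actually works. It compares each word's share of the text in an older and a newer sample. The two sample sentences, the punctuation list, and the function names (`tokenize`, `relative_frequencies`, `frequency_shift`) are assumptions made for illustration; a real corpus would contain millions of words and would need far more careful tokenisation.

```python
from collections import Counter
import unicodedata

# Assumed punctuation set for this sketch; includes the Bangla danda (।).
PUNCTUATION = "।,;:!?\"'()-"

def tokenize(text):
    """Split on whitespace and trim punctuation from each token."""
    text = unicodedata.normalize("NFC", text)  # keep Unicode forms consistent
    tokens = (raw.strip(PUNCTUATION) for raw in text.split())
    return [token for token in tokens if token]

def relative_frequencies(text):
    """Return each word's share of all tokens in the text."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: count / total for word, count in counts.items()}

def frequency_shift(older_text, newer_text):
    """Change in relative frequency per word (newer minus older)."""
    older = relative_frequencies(older_text)
    newer = relative_frequencies(newer_text)
    words = set(older) | set(newer)
    return {word: newer.get(word, 0.0) - older.get(word, 0.0) for word in words}

# Hypothetical one-sentence samples standing in for two time periods.
older_sample = "চিঠি আসে, চিঠি লিখি।"
newer_sample = "মেসেজ আসে, মেসেজ লিখি।"

shift = frequency_shift(older_sample, newer_sample)
for word, change in sorted(shift.items(), key=lambda item: item[1]):
    print(f"{word}: {change:+.2f}")
```

A negative value marks a word that is declining between the two samples, a positive value marks one that is rising, and a word absent from the newer sample has disappeared from it altogether.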
Drawing on language change and language contact theories, Professor Musa highlighted how semantic changes, borrowing, and code-switching (alternating between languages in conversation) have become increasingly common. “Religion has played a significant role in recent language change. The usage of words related to religion has grown substantially.
“Currently, there is no system for officially adding new words, and we do not have any formal corpus. New words appear through newspapers, and we infer their meanings from context.
“Without an official corpus, Bangla currently lacks discipline, and many words are used incorrectly in the wrong context.”
Bengali.AI regularly hosts competitions to stimulate innovation and engage the research community. For example, the Bengali Handwritten Digit Recognition Challenge invited participants to build machine-learning models capable of accurately identifying Bengali numerals.
Such competitions provide valuable benchmarks and encourage researchers to push the boundaries of what AI can achieve for underrepresented languages.