Today, especially with powerful developments in the field of artificial intelligence (AI) and numerous daily technological applications, the digital world is no longer just an opportunity or a challenge—it is the new reality.
Technology no longer resides solely in computers and smartphones; it is present in refrigerators, grocery stores, cars, and boats. Sooner or later, the moment will come when every single indigenous language on Earth will have to engage with this reality. This means that everyone must be prepared, all necessary tools must be in place, and no doors should be closed to any indigenous language seeking to enter and thrive in digital domains.
*
In March 2025, the First Global Survey of Indigenous Languages, jointly drafted by the Ad-Hoc Working Groups of the Global Task Force for Making a Decade of Action for Indigenous Languages, will be launched. This survey marks the first global effort to assess the current state of indigenous languages worldwide.
The survey covers several thematic areas, including language acquisition, transmission, and legal status. One key section is dedicated particularly to the digital world. For the first time, data acquired through this survey will enable the mapping of indigenous languages in digital domains, gathering insights on research, digital literacy, data production capabilities, resources, technologies, AI applications, data sovereignty, and many other critical topics.
It is designed not only to collect information on the current state of indigenous languages in the digital sphere but also to reflect the technological needs expressed by indigenous communities themselves.
In our Ad-Hoc group, we hope that this survey will become a powerful tool for the development of data-driven policies and approaches for the digital future of indigenous languages. Furthermore, by being repeatable, it can serve as an effective instrument for tracking progress in ensuring equality for indigenous languages in digital domains.
*
During the development of the survey, one key aspect became increasingly clear: digital equality, presence, and competitiveness of indigenous languages in digital domains start with very basic issues. If these issues remain unresolved, any benefits or hopes indigenous languages might have from digital domains, technology, or AI solutions could be lost.
These issues revolve around two fundamental elements:
- The ability to produce digital content or data in indigenous languages.
- The ability to use indigenous languages in digital domains and technological tools.
If a language has a limited number of speakers, does not have its symbols available for digital use, or even lacks a writing system; if it does not have simple tools like keyboard drivers or access to digital devices, then its capacity to produce digital data, and therefore its ability to be used in any digital domain, is greatly restricted.
This lack of data also makes it nearly impossible to develop language technologies for such languages, especially those requiring large amounts of data, like large language models, or to adapt other digital technologies for their use.
Conversely, even if an indigenous language has relatively well-established digital language data collections, has developed some technologies, and is fortunate enough to have its own resources to expand further—like my own language, Livonian—it often encounters another major issue: the inability to use these technologies in everyday digital tools.
- You cannot type a message in your language on your smartphone.
- Your favorite text editor or web browser does not allow you to use the spellcheck tools you have developed.
- Your favorite social media platform does not support your language in its language settings, and removes your reels because it does not understand the language used in them.
- You cannot register the name of your NGO or the name of your homestead in your own language because government-run databases do not accept your language's special characters.
Suddenly, simple benefits that speakers of major languages take for granted become insurmountable obstacles for you, the speaker of a language that is excluded from the list. Instead of planning which digital capacities to develop next for your language, you find yourself figuring out how to work around problems that should not exist in the first place.
The digital world, which should be full of opportunities for growth, instead becomes a new battleground, where indigenous languages face unequal competition against dominant world languages. In this competition, indigenous languages must work harder than others just to maintain their presence.
*
To ensure the inclusion of indigenous languages in the digital world, we must first address their exclusion. To achieve equality, we need to eliminate the most fundamental inequalities.
Several great initiatives are already tackling these issues:
- The Open Language Initiative, proposed by the indigenous Sámi language technology hub Giellatekno.
- Meta (Facebook) has made strides in localizing its platform for some indigenous languages and launched initiatives to democratize language technology.
- Google has worked on including indigenous languages in some of its products.
But we need more.
We urgently need a policy document or legal act at the highest possible level to motivate all key players—academia, industry, governments, and society at large—to act responsibly and to make sure that:
- Favorable conditions and basic needs for producing digital content and data in indigenous languages are met.
- All obstacles restricting the use of indigenous languages in digital domains are removed.
For academia, this means contributing to the languages and communities they research: listening to them, helping to establish inclusive orthographies, assisting with applied research in grammar, documentation, and digital data creation, and developing new language technologies and approaches specific to the needs of indigenous languages.
For industry, this means ensuring that all digital technology, current and future, can be localized into indigenous languages. This could be done by the original developer, third-party developers, or indigenous language speakers themselves. It would also demonstrate that we, indigenous people, are considered customers as well.
For governments, this means allocating or prioritizing targeted funding to support indigenous languages entering digital domains, and removing all technical barriers that restrict their digital use.
For society at large, it means supporting and demanding these actions. It means understanding that, regardless of the number of speakers, every language matters.
Because, in the end, this goes beyond indigenous languages and the rights of indigenous peoples. It touches on the core values of the Universal Declaration of Human Rights—the fundamental right of every human being to freedom of speech, opinion, and expression.
For how can we speak and express ourselves in the digital world if we cannot use our own language?
* Speech held at the conference “Language Technologies for all (LT4ALL)”, Paris, UNESCO, 24.02.–26.02.2025. Participation has been supported by the European Commission / ECG Recovery and Resilience Facility project “Internal and External Consolidation of the University of Latvia” (No.5.2.1.1.i.0/2/24/I/CFLA/007); sub-project “Improving access to a critically under-resourced language: AI-based approaches for producing and obtaining Livonian content” (LU-BA-PA-2024/1-0056).