Introduction
This article highlights key insights from decades of work on integrating Livonian, Latvia’s indigenous language, into the digital space. Livonian represents the future that many indigenous and endangered languages face. The Livonian community is small, approximately 1,000–2,000 people, with fewer than 20 fluent speakers remaining. It is extraterritorial and scattered, and has largely lost its language over time. Moreover, Livonian is a relatively low-resource language.
The digital world may offer a solution, serving as a form of “prosthetics” for what has been lost: shared spaces, language proficiency, functional domains, and access to language sources. All of these are crucial for language acquisition, use, and sustainability. The Livonian experience offers valuable lessons on bringing indigenous languages into the digital world effectively and moving, step by step, toward digital equality.
Building Digital Foundations
Digital inclusion for indigenous languages begins with small but essential steps. Success depends not on technology itself, which constantly evolves, but on the ability to produce usable data. Data production is the foundation of all forms of language documentation and digital activity. Basic tools such as keyboard drivers, accessible symbols, and standardized orthographies are vital. Standardization must be inclusive in order to preserve linguistic variety and ensure broad community involvement.
The involvement of the community cannot be overstated. Community members are not only sources for documentation but are also essential contributors to building digital resources. At the UL Livonian Institute, we actively involve speakers in our work, such as transcribing texts, annotating corpora, providing voice samples for dictionaries and teaching the language to others. This dual approach benefits both research and the speakers, allowing them to immerse themselves in a Livonian environment, thereby keeping their language proficiency active.
Choosing the Right Technology
Selecting appropriate technologies is another crucial consideration for low-resource languages with an extremely limited number of speakers. Some technologies are resource-intensive and may not yield significant benefits for language sustainability. For example, while Optical Character Recognition (OCR) is an exciting tool, it is impractical for Livonian because much of its documentation is of poor quality, handwritten, and varies extensively in dialect and orthography. In such cases, manual transcription appears to be faster and more efficient.
Machine translation is another area of experimentation for Livonian. Although we attempted to develop Livonian machine translation, a lack of data made producing accurate Livonian texts infeasible. More importantly, the community has a greater need for tools that give non-proficient speakers access to Livonian content. Machine translation from Livonian into languages such as Latvian would be, and is becoming, far more beneficial, allowing the community to access Livonian texts and engage with the language, culture, and heritage, especially where community members are no longer proficient enough in the language.
Responsibility in Technology Development
Both academia and technology developers share a responsibility to ensure that their work benefits the communities they serve. Research and technological innovation must always address the question: how will this be accessible and useful to the community? Unfortunately, this consideration is often overlooked. For instance, Google recently published a Livonian keyboard driver for Android phones without consulting the community. As a result, the keyboard does not fully conform to Livonian orthography or meet the community’s needs, and is usable only with limitations.
Historical examples reinforce this point. A Livonian-German dictionary published in the 1930s used phonetic transcription and German equivalents, both of which were inaccessible to Livonian speakers, who lacked proficiency in German and the ability to read phonetic symbols. It took over 50 years for the first dictionary in Livonian orthography to be published. These lessons underline the importance of designing resources that serve the community effectively from the outset.
Avoiding Digital Pollution
It should also be kept in mind that poorly executed technologies can create “digital pollution,” where low-quality content dominates digital spaces and undermines language use. For example, poorly implemented translation tools or chatbots may generate enormous amounts of inadequate text and overwhelm all of the language’s digital domains, eroding trust in the language’s digital presence.
This could effectively harm, rather than support, the language’s survival not only in digital domains, but as a whole.
Dreams vs. Fantasies
Finally, a critical takeaway from our work is the importance of pursuing achievable dreams rather than fantasies. Fantasies are just that: things that were never meant to be real, and chasing them can even be dangerous. While some technologies may appear appealing, they can exhaust community resources without delivering tangible benefits. Dreams, on the other hand, are grounded in reality and have the potential to be realized. At the UL Livonian Institute and the Ad-hoc Working Group on Digital Equality and Domains, we believe that achieving equality and sustainability for indigenous languages in the digital world is not only a dream but a necessity.
The digital world is crucial for the future prosperity of indigenous languages. Ensuring their inclusion in this space will contribute to global linguistic diversity and cultural richness. By prioritizing thoughtful and community-centered approaches, we can work toward a more inclusive digital future.
Conclusion
The Livonian case provides valuable lessons for other endangered languages entering the digital age. By focusing on community involvement, appropriate technologies, and responsible development, we can create sustainable digital environments that empower speakers and help sustain and revitalize their languages. Digital inclusion is not just about technology; it is about building a future where all languages, no matter how small, can thrive.
Speech given on International Translation Day 2024 during the online event “Translation, an art worth protecting: Moral and Material rights for Indigenous Languages,” organized by the International Decade of Indigenous Languages and Translation Commons.