Dear friends,
I hope this finds you well. I am excited to write about some Current AI updates and a deeper dive on our work on linguistic diversity.
Since I last wrote to you, we have been focused both on building Current AI and on starting our programming. In practice, this means we have been busy building out our team, from recruiting our first CEO and the members of our inaugural Board to staffing up with our first permanent team members. Thank you so much to the many of you who shared our opening for the CEO position. We have just opened recruitment for our first Programme Manager and our first Finance and Operations Manager - please do share the vacancies with anyone in your networks who should apply.
We have been working closely with our country partners to formalise our partnerships, for example establishing our collaboration with the government of Morocco, and have registered as a nonprofit association. On the programming side, we have selected three pilot programmes to start working on - linguistic diversity, health & human welfare, and audit & accountability. Before we officially launch the pilot programmes, we wanted to take the opportunity to do a deep dive on the critical topic of linguistic diversity. For local innovation to meet local needs, models will need to be trained in local languages.
Linguistic diversity is vital to the success of Public Interest AI. AI systems can help diagnose diseases from medical scans and write code in dozens of programming languages. But ask them to grasp how traditional healers in South Africa classify symptoms, or how Pacific Islander communities navigate climate adaptation, and you will hit a wall.
Despite rapid advancement in technology, most AI systems are trained in English and a handful of other dominant languages, leaving billions of people, and the thousands of languages they speak, without meaningful representation. The world is multilingual, multicultural, and complex. Linguistic diversity is central to building AI that reflects this reality.
To dig deeper into this fascinating topic, I have asked two of our partners, Vidushi Marda from AI Collaborative and Lori McGlinchey from the Ford Foundation, to share their perspectives on why linguistic diversity in AI matters and where Current AI can add value.
Why should we care about linguistic diversity in AI?
Lori: There are more than 7,000 spoken languages and perhaps over 300 distinct signed languages that currently carry and convey humankind’s collective cultural and intellectual heritage. Our languages help shape our worldview. Spoken and signed languages are complex and nuanced; they evolve over time. Yet artificial intelligence is significantly biased towards a handful of dominant languages, with consequences for human flourishing. A significant amount of the data used to train most major AI models comes from the internet. English accounts for a disproportionately large share of this online content, making it the most accessible and abundant language for training.
So those who believe that AI can truly be positively transformative must take seriously the challenge of meaningful linguistic diversity. When the handful of corporations building AI systems prioritize dominant languages, they exclude entire frameworks that people and communities use to make sense of the world. Right now the dominant AI industry is headed in the direction of the fast food industry - we may get scale, speed and convenience, but we give up meaningful cultural variety.
How bad is the gap?
Vidushi: 96% of languages are spoken by fewer than a million people each, and they are systematically underrepresented in AI infrastructure and applications. Without this basic infrastructure, under-represented languages, and in turn their communities, remain excluded from engaging with advancements in AI. The result is that for the majority of the world, AI systematically misunderstands context and flattens nuance. Take healthcare: an AI system might completely miss how different cultures express pain or describe symptoms, leading to misdiagnosis. Or in legal contexts, it might fail to understand how indigenous communities structure authority and decision-making. The gap is not just about language; it is also about cultural context and community.
What is the path forward?
Lori: Small is beautiful! There is already a global ecosystem of small language technology developers building infrastructure that enables communities to develop their own datasets, build culturally grounded benchmarks, and maintain control and stewardship over how their languages are represented in AI systems. That is exactly why we have made linguistic diversity our first pilot programme. For example, the Huniki Federation is a cooperative venture of African language AI tech startups offering high-quality, multi-language services for speakers of under-resourced African languages and those seeking to reach them.
Members of the Huniki Federation are working with and hiring people from their local communities, offering users AI technology that reflects the diversity of African languages. This extends power to smaller language tech startups and supports local economies, while giving people access to higher-quality language tech models. As a federation of organizations, Huniki can offer a single interface to meet user needs while allowing each member startup to maintain its independence and share data and resources.
Why now?
Vidushi: Multilingual models often outperform monolingual ones, even on dominant languages like English. We are already seeing this in Africa, where new datasets covering multiple local languages are showing that AI trained this way captures cultural nuance more effectively and delivers stronger results than English-only models. Linguistic diversity is not charity; it is a fundamental requirement for AI systems to work across jurisdictions, and a competitive advantage. At Current AI, we want to bolster efforts to make AI systems work in the public interest - linguistic diversity is a critical building block of that vision.
The perspectives Lori and Vidushi shared underline why we have chosen linguistic diversity as our first pilot programme, and how it connects to the broader work we are building at Current AI. I am looking forward to sharing more with you in due course.
Thanks for following along.
Martin Tisné
