Using AI to preserve cultural heritage

Historic sites, crafts, storytelling and languages are all examples of ‘cultural heritage’. They are the legacies of generations, tangible and intangible, that we inherit from the past, value in the present and pass on to future generations. Some of them have withstood the test of time for millennia – and may do for many more.

Digital technologies are now deeply embedded in how we understand, navigate and experience the world, with artificial intelligence increasingly influencing everyday systems and decisions. Long before the rise of generative AI, many AI applications operated ‘quietly’ in the background, focused on prediction, optimisation and automation. Today, AI is far more visible and expansive, with a growing range of technologies – spanning language, images, data and decision-making – interwoven into daily life and firmly established in public and professional discourse.

In the realm of cultural heritage, AI has the potential to offer powerful tools that safeguard legacies and traditions and foster understanding across communities. However, ethical questions must be addressed if we are to minimise the risk that comes with using these tools. For example, who decides what is preserved or left behind, who provides and owns the data, and what does provenance mean in relation to large language models?

Additionally, as AI collects and categorises data, we are already seeing the power it wields to shape cultural narratives; to decide which stories are told and which voices are amplified. Bias is embedded in many AI systems and there is a very real risk that marginalised voices are left behind.

At a Digital Leaders AI Week Summit, I gave a talk that explored responsible approaches to digitisation, translation and immersive storytelling to preserve cultural heritage, focusing on three core themes: the importance of inclusive and ethical datasets; designing AI in ways that foster inclusivity and empathy; and the potential for AI to connect people through shared stories.

Title slide of presentation, with Dr Jo Morrison to the right.

Watch the presentation in full on YouTube.

This blog reflects on that session and expands the discussion, exploring how the cultural heritage sector can engage with AI in a purposeful, small-scale and thoughtful way, thus building confidence, capability and impact over time.

The need for inclusive and ethical datasets

Technology is never neutral; it is shaped by those who create and distribute it. Every algorithm, piece of code and digital tool holds the creator’s choices, their values and, by proxy, their bias. Sometimes those choices are deliberate; at other times, they reflect unconscious bias rather than the needs of the communities the technology should serve.

If we focus on language, datasets used to train large language models that exclude minority languages are considered biased because they fail to represent the full diversity of human communication. As a result, AI systems trained on these incomplete datasets, including tools such as OpenAI’s ChatGPT, tend to perform well in dominant languages like English or Spanish, while struggling or failing entirely with minority languages such as Ulster Scots, Cornish and Manx. This imbalance privileges already powerful groups, while further marginalising communities whose languages and cultural knowledge are least visible in large-scale training data.

Cornish flag in the breeze, with beach, hills, fields and blue sky behind.

Photo: via Runesol

To put this problem into perspective: of the approximately 6,700 languages spoken around the world, UNESCO estimates that 40% (roughly 2,700) are in danger of disappearing, most of them Indigenous. The UN has described this as a “critical situation” and declared 2022-2032 the International Decade of Indigenous Languages in a bid to raise global awareness.

Te Hiku Media is one organisation working to preserve Indigenous languages, and has been committed to revitalising the Māori language through its tribal radio station since the 1990s. Over the past three decades, the charitable media organisation has amassed a large audio repository of people speaking Te Reo Māori, the language of the Māori people. That repository led to the development of a bespoke AI Te Reo speech recognition model, which achieves 92% accuracy.
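How the 92% figure was measured is not stated here. As a rough illustration only, speech recognition quality is commonly reported using word error rate: the number of word substitutions, insertions and deletions needed to turn the model's transcript into a trusted human transcript, divided by the length of the human transcript. The sketch below assumes that convention; the function name and the placeholder sentences are hypothetical and are not taken from Te Hiku Media's work.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference length.

    Illustrative assumption: accuracy is judged by comparing a model
    transcript (hypothesis) against a human transcript (reference).
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


# Placeholder sentences: one word out of five is transcribed incorrectly.
human_transcript = "the quick brown fox jumps"
model_transcript = "the quick brown box jumps"
error_rate = word_error_rate(human_transcript, model_transcript)
print(f"word error rate: {error_rate:.0%}")
```

Under this convention, a lower error rate means a better model. A benchmark built by the community itself decides which recordings and transcripts count as the reference, which is one reason locally relevant benchmarks matter.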

Fundamentally, the project was undertaken with a culture of care, accountability and respect. The Māori community and their elders gave permission and retained sovereignty over their data, and have also created their own small data centre with NVIDIA Graphics Processing Units, which can handle complex calculations at high speed.

Benchmarking has played a pivotal role in helping to create fully inclusive datasets, which Suzanne Duncan of Te Hiku Media addresses in the paper, ‘Fit for our purpose, not yours: Benchmark for a low-resource indigenous language’. Essentially, it explains why most modern benchmarks for natural language processing do not reflect Indigenous cultures, and why new benchmarks are needed to support decisions about developing tools that are relevant to the community.

Ultimately, Te Hiku Media hopes the project will serve as a model for other Indigenous communities to create language tools that preserve and promote more Indigenous languages. It is an inspiring example that I hope will inform many others.

A group of Te Hiku Media contributors and family members smiling and using laptops.

Photo: Te Hiku Media / Nvidia

Designing AI to promote inclusivity and empathy

Building AI technologies that empower Indigenous communities, safeguard their knowledge and strengthen their ability to thrive in a digital world is fundamental to creating fair and equal societies for future generations.

In 2024, Canadian startup wâsikan kisewâtisiwin was named a Solver Team by MIT Solve (an initiative of the Massachusetts Institute of Technology), marking an important milestone in its work to use technology in service of community. The organisation partnered with the Alberta Machine Intelligence Institute to develop two artificial intelligence tools with the goal of preventing further harm to First Peoples living in Canada.

The first tool helps to monitor hate speech and bias towards Indigenous Peoples on social media, and the second provides users with factual information about Indigenous Peoples and corrects bias and racism detected in written materials. These tools aim to lift the emotional labour from First Peoples, who regularly take on the responsibility of educating Canadians about Indigenous Peoples, their traditional ways or the complex history between Indigenous Peoples and colonised governments and systems.

“Indigenous People must be involved in the development of artificial intelligence – it’s quickly becoming a critical piece of infrastructure, despite already being known to perpetuate harmful bias. It needs our perspective,” said Shani Gwin, Founder and CEO of wâsikan kisewâtisiwin.

How AI can connect people through storytelling

Storytelling is an incredibly powerful way to build empathy and understanding between different cultures and communities, and technology provides us with a means to bridge those gaps.

i.Detroit in the US is a great example of this in practice. Developed by artist Marcus Lyon over a three-year period, and grounded in a six-month community-led nomination process, the project celebrates 100 citizens making significant contributions to the city and beyond.

A hand with phone, scanning a book with text and portrait. Scanning the portrait activates an audio recording of an interview with the photo's subject.

Photo: Marcus Lyon

Through photographic portraits, app-based oral histories and ancestral DNA mapping, the project offers a deeper understanding of Detroit’s identity, inviting reflection on our shared roles within communities. As part of the digital experience, Calvium created an audio experience that integrates with the visual elements of the book and exhibition. For instance, when a user scans a portrait with their phone, the voice of the subject tells their story in their own words.

Machine learning was used to train the app to recognise each portrait. Our developers did this by feeding the model thousands of image variations of every artwork, allowing it to learn how the portraits might appear under different real-world conditions, such as being scanned from various angles or in changing light. When a visitor scans a portrait, the image is processed by the trained model, which accurately identifies the artwork and triggers the corresponding soundtrack. Through this approach, the system achieved 99% recognition accuracy, delivering a smooth and immersive experience for users.
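The idea of showing a model many variations of each artwork is often called data augmentation. The sketch below is a minimal illustration of that idea, not the pipeline used for the app: it treats a portrait as a tiny grid of greyscale values and creates variations by changing brightness (standing in for changing light) and shifting the image sideways (a crude stand-in for a different scanning angle). The function names, value ranges and toy image are all assumptions made for illustration.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping results to the 0-255 range."""
    return [[min(255, max(0, round(pixel * factor))) for pixel in row]
            for row in image]


def shift_horizontally(image, offset, fill=0):
    """Shift pixels right (positive offset) or left (negative offset).

    Vacated positions are padded with `fill`, so the image keeps its size.
    """
    width = len(image[0])
    shifted = []
    for row in image:
        if offset >= 0:
            shifted.append([fill] * offset + row[:width - offset])
        else:
            shifted.append(row[-offset:] + [fill] * (-offset))
    return shifted


def make_variations(image, count, seed=0):
    """Return `count` randomly altered copies of `image` (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the same variations are reproducible
    variations = []
    for _ in range(count):
        factor = rng.uniform(0.6, 1.4)  # darker or brighter lighting
        offset = rng.randint(-1, 1)     # small sideways shift
        variations.append(
            shift_horizontally(adjust_brightness(image, factor), offset))
    return variations


# A toy 3x4 "portrait"; a real artwork would have many more pixels.
portrait = [
    [10, 50, 90, 130],
    [20, 60, 100, 140],
    [30, 70, 110, 150],
]
variations = make_variations(portrait, count=5)
print(f"generated {len(variations)} variations of one portrait")
```

In a real project, each variation would keep the label of the portrait it came from, so the model learns that all of them should trigger the same recording.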

Returning to Indigenous communities, we can see how a project like i.Detroit could be used to amplify untold stories and underrepresented voices, and to support deeper understanding of distinctive cultural heritage. AI can be used purposefully to help us break down language barriers at both a community and global level, creating a shared human experience.

Final thoughts

Preserving the world’s rich and diverse cultural heritage is both a responsibility and an opportunity. When used thoughtfully, ethically and creatively, as we have seen, AI offers powerful tools to help safeguard and share the cultures that have shaped communities across generations. By embedding ethics from the outset and designing inclusively, in close collaboration with the communities these technologies are meant to serve, we can ensure that AI technologies support rather than diminish cultural identity. With these foundations in place, digital innovation can bring cultural heritage to life in new ways, strengthening connection, deepening understanding and fostering empathy.

To sum up with a quote from Larry Swallye, a cultural historian from the Lakota tribe: “The language, the whole culture of the Lakota, comes from the song of our heartbeat. It’s not something that can quickly be put into words. It’s a feeling, it’s a prayer, it’s a thought, it’s an emotion – all of these things are in the language.”
