UK-LLM, NVIDIA Unveil Welsh-Reasoning AI Trained on Isambard-AI

Backed by government supercomputing with linguistic validation, the open release targets public services.

Overview

The bilingual model builds on NVIDIA’s Nemotron family, with Llama Nemotron Super (49B) and Nemotron Nano (9B) post-trained to reason in Welsh.
To create sufficient data, the team translated more than 30 million entries using NVIDIA NIM microservices with gpt-oss-120b and DeepSeek-R1.
Training ran on the government-backed Isambard-AI supercomputer using DGX Cloud Lepton and hundreds of GH200 Grace Hopper Superchips.
Bangor University’s Canolfan Bedwyr, led by senior terminologist Gruffudd Prys, verified machine-translated data and evaluated Welsh-specific grammar and usage.
The model and Welsh datasets are slated for open availability to enterprise and public-sector users via providers including Nscale, with plans to extend the approach to other UK and international minority languages.