The Sapienza NLP (Natural Language Processing) research group, led by Roberto Navigli, Full Professor at the Department of Computer, Control and Management Engineering “Antonio Ruberti” of Sapienza University of Rome, today announces the release of the Minerva models, a new family of Large Language Models (LLMs) trained from scratch for the Italian language.
Minerva has been developed within FAIR (Future Artificial Intelligence Research), the project led by the National Research Council that implements the national AI strategy through PNRR (National Recovery and Resilience Plan) funding, and in collaboration with CINECA, which provided access to the Leonardo supercomputer.
The Minerva models are available in preview to the FAIR scientific community from today. In the coming weeks, their most advanced version will be released publicly, including the ability to interact conversationally in Italian.
Minerva represents a significant step forward for AI made in Italy, reinforcing the country’s excellence in generative AI. The project is led by Prof. Roberto Navigli, an ACL Fellow and recipient of two prestigious ERC grants, together with two talented young researchers, Edoardo Barba and Simone Conia.
“The distinctive feature of Minerva models is that they have been built and trained from scratch using open-access texts, unlike existing Italian models, which are based on adapting models such as LLaMA and Mistral, whose training data remain unknown,” explains Roberto Navigli. “Each Minerva model has been trained on a vast collection of documented Italian and English online sources, totaling over 500 billion words, the equivalent of more than 5 million novels. Transparency in model training not only strengthens trust among users, the scientific community, public institutions, and industry, but also fosters continuous improvement and represents a first step toward rigorous verification processes to ensure compliance with laws and regulations.”
With a range of models varying in size and computational capacity, each containing billions of parameters, the Minerva project aims to provide transparent foundations for AI systems applicable across multiple domains, from natural language understanding to text generation, from machine translation to automated customer support. This flexibility makes Minerva a valuable resource for researchers, companies, and developers seeking to leverage artificial intelligence to improve efficiency and interaction.
The initiative sets ambitious goals and aims to reshape the Italian AI landscape. As Giuseppe De Pietro, President of the FAIR Foundation, states: “We are very pleased with this first major milestone achieved within the FAIR scientific community. We strongly believe in the research direction leading to the development of an Italian large language model, and for this reason we are investing PNRR resources in this project, which represents a strategic first step of great importance.”
“This important result, unique in Italy, confirms the scientific excellence of the Department of Computer, Control and Management Engineering (DIAG) at Sapienza, particularly in the field of Artificial Intelligence, where we can count on a large group of researchers of outstanding national and international level,” says Tiziana Catarci, Director of DIAG.
A further innovative aspect of the initiative is the Sapienza NLP group’s commitment to developing new evaluation benchmarks: tools specifically designed to assess how well large language models capture and respect the cultural and linguistic nuances of Italian. In addition, the project will release comprehensive technical documentation describing the engineering process and scientific findings, so that others can replicate the implementation and training of the models.



