SoFAIR: CORE to Coordinate New Project to Facilitate the Reproducibility of Research Studies

We are pleased to announce that The Open University has just been awarded a new research grant in the international CHIST-ERA Open Research Data & Software Call, which aims to enhance the discoverability and reusability of open research software.


Open research software and data are pivotal for scientific innovation and transparency, but are often not cited as first-class bibliographic records. Many software mentions therefore remain concealed within the text of research papers, hampering their discoverability, attribution, and reuse. This, in turn, makes it harder to reproduce research studies.


The SoFAIR project (from Making Software FAIR) aims to address this critical issue by enhancing the management of the research software lifecycle and ensuring research software and data adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. SoFAIR is a €499k international project coordinated by (1) The Open University in partnership with (2) INRIA, France; (3) Brno University of Technology, Czech Republic; (4) the Polish Academy of Sciences (PAN), Poland; and (5) the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), United Kingdom. SoFAIR is funded under the 2022 CHIST-ERA Open and Reusable Research Data and Software (ORD) call.


The project will build on the existing capabilities of the open scholarly infrastructures operated by the project partners, specifically the CORE aggregator (operated by The Open University), the HAL repository and the Software Heritage Foundation archive (operated by INRIA), and Europe PMC (operated by EMBL-EBI). The project’s overarching goal is a comprehensive solution consisting of several interlinked components: machine learning-assisted identification of research software within scholarly manuscripts, validation by authors, and subsequent registration with persistent identifiers (PIDs), culminating in permanent archival.


The project’s overall approach is rooted in the idea of extending, adapting and linking existing open scholarly infrastructures, including (a) CORE, leveraging its extensive corpus of scholarly literature and its position as a nexus for the global open repositories network, (b) the Software Heritage Foundation universal software archive, and (c) open source tools (GROBID / Softcite), using existing open protocols to support the entire lifecycle of research software assets.


The developed tools and workflow will be validated through three distinct use cases supported by project partners PAN, INRIA (HAL) and EMBL-EBI: (1) a life sciences demonstrator, (2) a multi-disciplinary demonstrator for institutional repositories, and (3) a digital humanities case study. These diverse applications underscore the project’s commitment to versatility and comprehensive coverage. SoFAIR will deliver a step-change in the way research software is identified, validated, registered and archived for the long term. From a broader perspective, SoFAIR aligns with the FAIR agenda and Horizon Europe initiatives on open research data and software. The project will foster integration with European infrastructures such as EOSC, CLARIN, and DARIAH, ensuring robust connectivity across the research landscape.


Professor Petr Knoth, Project Coordinator and Head of CORE, said: “SoFAIR will demonstrate the significant value that can be achieved by open scholarly infrastructures working in synergy to deliver a new AI-supported solution for identifying, discovering and archiving research software mentioned in research manuscripts. This will constitute an important contribution towards increasing reproducibility of research studies and making research assets FAIR.”


Harith Alani, KMi Director, added: “As the director of KMi, I am proud to see our CORE technology and AI expertise converge in SoFAIR, exemplifying the commitment of KMi and The Open University to open research software. SoFAIR will foster greater reproducibility and accessibility in research, paving the way for a more transparent and inclusive academic landscape.”


In essence, SoFAIR is paving the way for a transformative shift in how research software is discovered, accessed, and referenced. By increasing the number of research software assets that can be registered with PIDs, and thus made citable, the project will not only facilitate software reuse and reproducibility but can also incentivise the creation of innovative research software, thereby contributing to the broader scientific community’s collective progress.
