KMi Researchers participated in a case study for the ADBU’s Text and Data Mining in Research Report

Last month, the French Association of Directors and Officers of University Libraries and Documentation (ADBU) released a report entitled “Text and Data Mining in Higher Education and Public Research”, which mainly explores the UK and French copyright exceptions for text and data mining (TDM). In more detail, the report lists the benefits of TDM in scientific research, identifies the primary threats to the adoption and practice of TDM, namely legal and technical ones, makes the case for developing a technical infrastructure, and discusses the barriers to motivation and the developments still needed in the field.

 

In an effort to understand the level of TDM adoption, or the lack thereof, the report presents various case studies, one of which is the CORE project. CORE, an aggregation service currently holding around 4.5 million full-text and 66 million metadata records, provides infrastructure for TDM via its main services, namely the CORE API and the CORE Datasets. As the report puts it: “Text-mining at scale cannot take place without infrastructure. Investment is needed in the technologies used to aggregate, normalise, interrogate and preserve TDM materials”. CORE’s services offer open access content and are provided to everyone free of charge. In addition, CORE is participating in the EU-funded project OpenMinTeD, which aims to create a TDM infrastructure focusing on legal, technical, policy and interoperability issues; CORE’s role in the project is to act as an open access scientific content provider.

In addition to the technical challenges, there are also legal requirements that create obstacles and limit the incentives for TDM. Even though both UK and French copyright law have been amended, there are still grey areas that hold researchers back from applying TDM practices. Furthermore, the legal framework is not harmonised across countries, and in some of them it does not exist at all. The report states that “changes to copyright law must be accompanied by improvements in access, infrastructure, skills and incentives for TDM”. In that context, and while CORE already contributes to the promotion of TDM on the technical side, it welcomes all efforts to advance TDM and is open to providing assistance with the development of new policies and the improvement of existing ones, based on its own TDM experience.

 

Related Links: