DOTTORATO DI RICERCA IN INGEGNERIA DELL'INFORMAZIONE


Baglioni M., Mannocci A., Pavone G., De Bonis M., Manghi P.: "(Semi) automated disambiguation of scholarly repositories",

Written by MICHELE DE BONIS

The full exploitation of scholarly repositories is pivotal in modern Open Science, and scholarly repository registries are kingpins in enabling researchers and research infrastructures to list and search for suitable repositories. However, since multiple registries exist, repository managers tend to register the repositories they manage several times to maximise their traction and visibility across different research communities, disciplines, and applications. These multiple registrations ultimately lead to information fragmentation and redundancy on the one hand and, on the other, force registries’ users to juggle multiple registries, profiles and identifiers describing the same repository. Registries are aware of these problems and, whenever possible, claim equivalence between repository profiles by cross-referencing their identifiers across different registries. However, as we will see, this “claim set” is far from complete; therefore, many replicas slip under the radar, possibly creating problems downstream. In this work, we combine such claims to create duplicate sets and extend them with the results of an automated clustering algorithm run over repository metadata descriptions. We then manually validate our results to produce an “as accurate as possible” de-duplicated dataset of scholarly repositories.
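As a rough illustration of how pairwise equivalence claims can be combined into duplicate sets, the sketch below groups identifiers with a union-find structure. It is not taken from the article: the function name `duplicate_sets`, the `(id_a, id_b)` pair format, and the identifier strings in the usage note are assumptions made for this example only.

```python
from collections import defaultdict


def duplicate_sets(claims):
    """Group identifiers linked by equivalence claims into duplicate sets.

    `claims` is an iterable of (id_a, id_b) pairs, each meaning that the two
    profiles are claimed to describe the same repository (hypothetical format).
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in claims:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    groups = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)

    # Sorted output so results are deterministic and easy to compare.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

Under the same assumptions, candidate pairs produced by a clustering step could be treated as additional edges, for example `duplicate_sets(claims + clustered_pairs)`, before the merged sets are validated manually. The identifiers used to try this out (such as `"re3data:A"` or `"opendoar:1"`) would be made-up placeholders, not real registry records.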

Keywords: Scholarly Registries, Scholarly Repositories, De-duplication, Open Science


Published in Elenco Pubblicazioni - Publications


