Interoperability for Data Repositories. Machine Methods for Retrieving Data for Display or Mining Utilising Persistent (data-DOI) Identifiers
Title: | Interoperability for Data Repositories. Machine Methods for Retrieving Data for Display or Mining Utilising Persistent (data-DOI) Identifiers |
Authors: | Rzepa, HS; Mason, N; Mclean, A; Harvey, M |
Item Type: | Dataset |
Abstract: | Use of a persistent identifier (the DOI) to access journal articles is now almost universal amongst researchers. The DOI resolves to the journal landing page, where a human must then take over navigation (or payment). More recently, depositing data in open access repositories, with a data-DOI assigned to the resulting dataset or fileset, has been increasingly adopted and will probably be mandated by funders in the near future. Unfortunately, the mechanisms for retrieving and applying data from such sources are still inherited from those developed for journal articles. We argue that these mechanisms are not fit for (data) purpose. In these three demonstrations, we show how existing standards can be used to automate data retrieval, starting purely from the DOI assigned to the objects. The first utilises the 10320/loc method (see doi:10.1021/ci500302p), which is flexible and efficient but is not supported by the DataCite registry. The other two schemes were developed to achieve such interoperability: the first uses the DataCite Media API, and the second exploits added metadata such as relatedMetadataScheme = ORE to locate the repository's ORE resource map. We have embedded these methods in a JavaScript-based data viewing demonstrator (JSmol) designed to display molecular information. Handlers for other types of data could readily be incorporated, and the system could also be exploited for data mining. Examples of recently published journal articles that use such data-DOI handling will be cited. |
Issue Date: | 11-Dec-2014 |
URI: | http://hdl.handle.net/10044/1/30149 |
DOI: | https://dx.doi.org/10.6084/m9.figshare.1266197.v1 |
Keywords: | persistent doi; Information Systems; Cheminformatics |
Appears in Collections: | Faculty of Natural Sciences - Research Data |
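
The first retrieval method named in the abstract, 10320/loc, stores a small XML list of alternative locations in the DOI's Handle record, so a program can go from a DOI straight to a data file rather than to a landing page. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the public DOI proxy REST endpoint at https://doi.org/api/handles/ and the documented `<locations>`/`<location>` XML layout. The DOI, the example.org URLs, and the `filename` and `mimetype` attribute names are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional
from urllib.parse import quote

# Public DOI/Handle proxy REST endpoint (assumed reachable; not called here).
HANDLE_API = "https://doi.org/api/handles/"


def handle_query_url(doi: str) -> str:
    """Build a proxy URL that asks only for the 10320/loc values of a DOI."""
    return f"{HANDLE_API}{quote(doi, safe='/')}?type=10320/loc"


def parse_locations(loc_xml: str) -> List[Dict[str, str]]:
    """Return the attributes of every <location> element in a 10320/loc value."""
    root = ET.fromstring(loc_xml)
    return [dict(loc.attrib) for loc in root.iter("location")]


def pick_location(locations: List[Dict[str, str]], **wanted: str) -> Optional[str]:
    """Return the href of the first location matching all wanted attributes."""
    for loc in locations:
        if all(loc.get(key) == value for key, value in wanted.items()):
            return loc.get("href")
    return None


# Hypothetical 10320/loc value: one landing page and one direct data file.
# The filename and mimetype attributes are illustrative assumptions.
SAMPLE = """<locations>
  <location id="0" href="https://repo.example.org/record/1" weight="1" />
  <location id="1" href="https://repo.example.org/record/1/files/mol.cml"
            weight="0" filename="mol.cml" mimetype="chemical/x-cml" />
</locations>"""

if __name__ == "__main__":
    print(handle_query_url("10.1234/example"))
    locations = parse_locations(SAMPLE)
    print(pick_location(locations, mimetype="chemical/x-cml"))
```

A viewer or mining script would fetch the URL built by `handle_query_url`, extract the XML value from the JSON response, and pass it to `parse_locations`. Selecting by attribute rather than by position keeps the client independent of the order in which a repository lists its files.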