Introduction

The Semantic Lancet Project is focused on building a Linked Open Dataset on scholarly publications. The current dataset contains metadata about all papers published in the Journal of Web Semantics by Elsevier.

For each paper, the dataset reports bibliographic data, abstract and citations, expressed in RDF and compliant with the Semantic Publishing and Referencing (SPAR) Ontologies. The SPAR ontologies are ontological modules for describing the various parts of the publishing domain in RDF, such as article metadata, citations, bibliographic references and citation contexts, people's roles and document statuses, document components and publishing workflows.

Our goal is to make the RDF triplestore (accompanied by a SPARQL endpoint) available for research purposes, together with a series of services to access, explore and make use of scholarly data.

The dataset is released under the terms of the Creative Commons Attribution 4.0 International license.

Data and Services

SPARQL endpoint
The SPARQL endpoint (live embedded interface and REST API) gives access to all the data in Semantic Lancet.
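As a rough illustration of how the REST API could be used, the following Python sketch prepares a SPARQL query as an HTTP GET request, following the standard SPARQL protocol. The endpoint URL is a hypothetical placeholder, and the use of dcterms:title and cito:cites to model titles and citations is an assumption about the dataset, not a documented fact. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: replace with the actual Semantic Lancet endpoint URL.
ENDPOINT = "https://example.org/sparql"

# Assumption: papers carry a dcterms:title and citations are stated with cito:cites.
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX cito: <http://purl.org/spar/cito/>

SELECT ?paper ?title (COUNT(?citing) AS ?citations)
WHERE {
  ?paper dcterms:title ?title .
  OPTIONAL { ?citing cito:cites ?paper . }
}
GROUP BY ?paper ?title
ORDER BY DESC(?citations)
LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
```

Passing the prepared request to urllib.request.urlopen would execute the query against a real endpoint; the JSON response could then be parsed with the json module.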
Data browser
It is an interactive and user-friendly interface that allows users to easily access papers, authors, citations, and so on. The design of the tool addresses two main issues: the huge amount of information users have to deal with and the complexity of that information. The solution we propose is to hide the intrinsic complexities of the data and of the underlying technologies, giving users a higher-level view of the dataset content.
BEX - Bibliographic EXplorer
BEX is an interactive web-based application aimed at supporting analysis and exploration of papers and citations available in the Semantic Lancet Triplestore. The BEX design was driven by Shneiderman's Information Seeking Mantra: Overview first, zoom and filter, then details-on-demand.
Abstract finder
It is an interactive and user-friendly interface that allows one to search paper abstracts by exploiting the semantic information about concepts, events, roles and named entities contained in the semantic abstracts. The module works in two phases. First, it creates a semiotic index of the abstracts with respect to a taxonomy of types. These types are aligned to WordNet synsets and DBpedia resources; thus, abstracts are indexed not for their plain textual content but for the concepts represented in that content, in a semiotic fashion. Second, a simple interface allows users to browse these concepts and the abstracts that expose them.
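The two phases can be pictured with a small Python sketch. It is a plain inverted index over toy data, not the module's actual semiotic indexing: the abstract identifiers, type names and concepts below are all invented for illustration, and the alignment to WordNet and DBpedia is left out.

```python
from collections import defaultdict

# Toy, hand-made annotations (hypothetical): abstract id -> (type, concept) pairs.
ANNOTATED_ABSTRACTS = {
    "paper-1": [("Event", "Publication"), ("Role", "Author")],
    "paper-2": [("Event", "Publication"), ("NamedEntity", "DBpedia")],
    "paper-3": [("Role", "Author"), ("Role", "Reviewer")],
}


def build_concept_index(annotated):
    """Phase 1: index abstracts by the typed concepts they mention."""
    index = defaultdict(lambda: defaultdict(set))
    for abstract_id, annotations in annotated.items():
        for concept_type, concept in annotations:
            index[concept_type][concept].add(abstract_id)
    return index


def abstracts_for(index, concept_type, concept):
    """Phase 2: look up the abstracts that expose a given concept."""
    return sorted(index.get(concept_type, {}).get(concept, set()))


index = build_concept_index(ANNOTATED_ABSTRACTS)
authors = abstracts_for(index, "Role", "Author")
```

Grouping concepts under their type first mirrors the idea of browsing a taxonomy of types before reaching individual concepts and their abstracts.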
Citation explorer
It is an interactive web-based tool for analysing and making sense of citations. The tool is composed of three modules. Once a particular journal has been selected, the first module, in the top-left area, shows an overview of all the papers available for the journal (i.e., the circles) and their citation network. The second module, shown on the right, summarises the functions (depicted by using different colours) of the incoming and outgoing citations related to each paper (shown at the bottom) and each author (top). This tool provides new elements to evaluate citations: for example, users can distinguish the impact of an article cited several times as an authority from that of another article cited more often but for a generic reason (e.g., for information or as related work). The last module, shown at the bottom left, allows users to compare the activity of two authors by showing the distribution of their citations (and the related functions) on a time axis. This time-aware perspective can be useful to investigate the role played by an author within the particular journal under consideration.
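The kind of summary behind the second module can be sketched in a few lines of Python. The citation triples below are invented toy data, and labelling functions with CiTO property names (citesAsAuthority, citesForInformation, citesAsRelated) is an assumption about how the dataset encodes them.

```python
from collections import Counter, defaultdict

# Toy data (hypothetical): (citing paper, cited paper, citation function).
CITATIONS = [
    ("paper-A", "paper-X", "citesAsAuthority"),
    ("paper-B", "paper-X", "citesAsAuthority"),
    ("paper-A", "paper-Y", "citesForInformation"),
    ("paper-B", "paper-Y", "citesAsRelated"),
    ("paper-C", "paper-Y", "citesForInformation"),
]


def incoming_functions(citations):
    """Count, for each cited paper, how often each citation function occurs."""
    summary = defaultdict(Counter)
    for _citing, cited, function in citations:
        summary[cited][function] += 1
    return summary


summary = incoming_functions(CITATIONS)
```

In this toy data, paper-Y receives more citations overall than paper-X, but only paper-X is cited as an authority, which is the kind of contrast the tool is meant to make visible.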
Web Data Reporter
It is a Web application that queries both scholarly data and provenance data and reports problematic situations in the form of a Web page. The current implementation allows us to spot potential mistakes in the datasets, data incompleteness, and data duplication.
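To give a feel for two of these checks, here is a minimal Python sketch of incompleteness and duplication detection. It works on plain dictionaries rather than on RDF and provenance data, and the record layout, the list of required fields and the title-based duplicate rule are all assumptions made for illustration, not the application's actual logic.

```python
from collections import defaultdict

# Assumption: these fields are the ones a complete record must have.
REQUIRED_FIELDS = ("title", "authors", "year")

# Toy records (hypothetical).
RECORDS = [
    {"id": "r1", "title": "A Sample Paper", "authors": ["A. Author"], "year": 2010},
    {"id": "r2", "title": "a sample paper ", "authors": ["A. Author"], "year": 2010},
    {"id": "r3", "title": "Another Paper", "authors": [], "year": None},
]


def find_incomplete(records):
    """Map each record id to the required fields that are missing or empty."""
    report = {}
    for rec in records:
        missing = [field for field in REQUIRED_FIELDS if not rec.get(field)]
        if missing:
            report[rec["id"]] = missing
    return report


def find_duplicates(records):
    """Group record ids whose titles match after case and whitespace normalisation."""
    groups = defaultdict(list)
    for rec in records:
        key = " ".join(str(rec.get("title") or "").lower().split())
        groups[key].append(rec["id"])
    return [ids for key, ids in groups.items() if key and len(ids) > 1]


incomplete = find_incomplete(RECORDS)
duplicates = find_duplicates(RECORDS)
```

A real report would more likely be produced by SPARQL queries over the triplestore; the dictionaries here only stand in for query results.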

People