
How to select predicates and their respective labels in SPARQL?

I am trying to list all predicates of an ontology (NIF) and their labels. Without the label in the query, I get 80 results, so I assume there are 80 predicates with the term 'nif' in them.

Then I added the line containing rdfs:label to the query, and it produced no results. So I wrote the query below to first filter the URIs containing 'nif':

SELECT DISTINCT ?p ?label WHERE {
    ?s ?p ?o .
    FILTER (REGEX(STR(?p), "nif", "i")) .
    ?p rdfs:label ?label .
}
ORDER BY ?p

But it did not work. I tried using ?p a rdf:Property instead of ?s ?p ?o, and that did not work either. I then tried EXISTS and VALUES ?p {"nif"}, but neither of those worked.

Where am I making a mistake?

Used vs. declared properties: In RDF, there is a difference between using a predicate and declaring a predicate. It is possible to use a predicate without declaring it, and it is possible to declare a predicate without using it.

(It is also possible—and common—to declare a predicate in one file, and use it in a different file. This is how RDF enables re-use of a single ontology in different datasets. There may or may not be an owl:imports statement that links the two files.)

To list all predicates used in the default graph:

SELECT DISTINCT ?predicate {
    ?s ?predicate ?o
}
ORDER BY ?predicate

To list all predicates declared in the default graph, we need to consider which schema language is used to declare them. To list predicates declared with RDF Schema:

SELECT ?predicate {
    ?predicate a rdf:Property
}
ORDER BY ?predicate

To list predicates declared with OWL:

SELECT ?predicate ?type {
    VALUES ?type { owl:ObjectProperty owl:DatatypeProperty owl:AnnotationProperty }
    ?predicate a ?type
}
ORDER BY ?predicate

The query above takes into account that OWL has three different types of predicates: object properties, datatype properties, and annotation properties. So we basically query for each of the three.

With this knowledge, it should be possible to find out what predicates are used in the ontology, and what predicates are declared in the ontology.
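As a sketch of how the two views can be combined, the following query lists predicates that are used in the default graph but not declared there under any of the four types mentioned above. It assumes the standard rdf and owl namespaces and an endpoint that supports SPARQL 1.1 (for FILTER NOT EXISTS and VALUES); it does not account for inference.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

# Predicates that occur in some triple but have no declaration
# as rdf:Property or as one of the three OWL property types.
SELECT DISTINCT ?predicate {
    ?s ?predicate ?o
    FILTER NOT EXISTS {
        VALUES ?type {
            rdf:Property
            owl:ObjectProperty owl:DatatypeProperty owl:AnnotationProperty
        }
        ?predicate a ?type
    }
}
ORDER BY ?predicate
```

Vocabulary terms such as rdf:type will typically show up in the result unless the vocabularies that define them are loaded into the same graph.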

Now, about labels. The queries above all return the URI—a machine-readable identifier—for the predicates. To also retrieve labels, add ?label to the list of variables in the SELECT clause, and add this to the WHERE { ... } block:

OPTIONAL { ?predicate rdfs:label ?label }

For example:

SELECT ?predicate ?label {
    ?predicate a rdf:Property
    OPTIONAL { ?predicate rdfs:label ?label }
}
ORDER BY ?predicate

We make the pattern that retrieves the label optional, so if no label is provided in the default graph, the predicate is still returned but without a value for the ?label variable. This way, one can identify cases where a predicate exists (that is, it is either used or declared) but no label is provided.

If a predicate is declared but no label provided, then I would assume it's a low-quality ontology where insufficient care has been taken in its creation.

If a predicate is used but no label provided, I would not be surprised at all. It might just mean that the declaration and label are provided in a different file, and one needs to find that file and add it to the dataset in order to query labels.
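A related case is that the file with the labels has been loaded, but into a named graph rather than the default graph, so the label pattern never sees it. A minimal sketch for checking this, assuming a SPARQL 1.1 store that keeps named graphs separate from the default graph (some stores expose the union of all graphs as the default graph, in which case this is unnecessary):

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Look for labels in any named graph, and report which graph
# supplied each label. ?g stays unbound when no label is found.
SELECT DISTINCT ?predicate ?label ?g {
    ?s ?predicate ?o
    OPTIONAL {
        GRAPH ?g { ?predicate rdfs:label ?label }
    }
}
ORDER BY ?predicate
```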

Constructing labels from the URI: If the problem is a lack of labels in the ontology, and the labels also cannot be found elsewhere, then here is a version that constructs a best-effort label from the last part of the URI in case no label is declared:

OPTIONAL {
    ?predicate rdfs:label ?tmpl
}
BIND (coalesce(?tmpl, replace(replace(replace(str(?predicate), '.*[#/:]', ''), '_', ' '), '([a-z])([A-Z])', '$1 $2')) AS ?label)

This takes everything after the last hash, slash or colon in the URI, replaces underscores with spaces, and inserts spaces between words in CamelCase notation.
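Placed in a complete query, the fragment could look like the sketch below. It assumes the predicates of interest are those used in the default graph; swap the first triple pattern for one of the declaration patterns if declared predicates are wanted instead.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?predicate ?label {
    ?s ?predicate ?o
    # Use the declared label when there is one.
    OPTIONAL { ?predicate rdfs:label ?tmpl }
    # Otherwise derive one from the local name of the URI.
    BIND (coalesce(
        ?tmpl,
        replace(replace(replace(str(?predicate), '.*[#/:]', ''), '_', ' '),
                '([a-z])([A-Z])', '$1 $2')
    ) AS ?label)
}
ORDER BY ?predicate
```

For a hypothetical unlabeled predicate <http://example.org/ns#hasFirstName>, the constructed label would be "has First Name".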

Lastly, filtering by URI. Here it is important to be aware that filtering will only happen on the “raw” URI, and not on the prefix-abbreviated form. For example, the following filter accepts only predicates with rdfs in the URI:

FILTER regex(str(?predicate), 'rdfs', 'i')

But it would actually reject rdfs:label, rdfs:comment and any other properties in the rdfs namespace, because their full URIs are of the form

<http://www.w3.org/2000/01/rdf-schema#label>

so the URI actually doesn't contain the string rdfs. Something to keep in mind.
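Applied to the query in the question, the pieces combine as in the sketch below. It assumes that the full URIs of the NIF predicates contain the string nif, which the 80 results reported in the question suggest. The only substantive change from the original query is that the label pattern is optional, so predicates without a label in the default graph are no longer dropped.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?predicate ?label {
    ?s ?predicate ?o
    # Match against the full URI, case-insensitively.
    FILTER regex(str(?predicate), 'nif', 'i')
    # Keep predicates even when no label is available.
    OPTIONAL { ?predicate rdfs:label ?label }
}
ORDER BY ?predicate
```

If every row comes back with an empty ?label, the labels are most likely not in the queried graph at all.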

