
How to filter non-resolvable URIs on a SPARQL query?

Is it possible to filter out results that contain a non-resolvable URI from within the SPARQL query?

An example: I'm running the following query (endpoint: http://linkeddata.systems:8890/sparql):

PREFIX RO: <http://www.obofoundry.org/ro/ro.owl#>
PREFIX SIO: <http://semanticscience.org/resource/>
PREFIX EDAM:  <http://edamontology.org/>
PREFIX PHIO: <http://linkeddata.systems/ontologies/SemanticPHIBase#>
PREFIX PUBMED:  <http://linkedlifedata.com/resource/pubmed/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up:  <http://purl.uniprot.org/core/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?disn_1 ?label ?rel ?valor 
WHERE { ?disn_1 ?rel ?valor . ?disn_1 rdfs:label ?label FILTER(( ?disn_1 = <http://linkeddata.systems/SemanticPHIBase/Resource/host/HOST_00561>))}

As you can see in the results, one row binds the ?valor variable to a non-resolvable URI (text: /hostncbitaxid/). Is there a specific FILTER that I can add to the SPARQL query to remove results with non-resolvable URIs?

The API that I'm using to process these results in C# throws an exception on non-resolvable URIs, so I would like to filter them out in the SPARQL query (if possible).

How do you know that it's not resolvable? RDF doesn't have a concept of a "relative URI"; all URIs are resolved relative to something (what exactly may be an implementation detail in some cases), so you end up with absolute URIs. In the HTML results from that endpoint, I get http://linkeddata.systems:8890/hostncbitaxid/, and that could easily be resolvable.

That said, if you are ending up with results that include non-absolute URIs and you want to filter those out, you could use some heuristics to do that. For instance, if you only want URIs beginning with http, you can filter on that prefix. E.g., here's a query that returns two values for ?uri:

prefix : <urn:ex:>

select * where {
  values ?uri { <http://www.example.org/> </foobar> }
}
-----------------------------
| uri                       |
=============================
| <http://www.example.org/> |
| <file:///foobar>          |
-----------------------------

(Notice that the relative URI /foobar got resolved as a file:// URI.) You can keep only http URIs with a filter:

prefix : <urn:ex:>

select * where {
  values ?uri { <http://www.example.org/> </foobar> }
  filter strstarts(str(?uri), "http")
}
-----------------------------
| uri                       |
=============================
| <http://www.example.org/> |
-----------------------------
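
Applied to the query from the question, the same heuristic might look like the sketch below. It is untested against that endpoint and makes one assumption not stated above: that ?valor can also be bound to literals (labels, for example), which should be kept. The filter therefore only rejects values that are IRIs and do not start with "http". Unused prefixes are omitted for brevity.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?disn_1 ?label ?rel ?valor
WHERE {
  ?disn_1 ?rel ?valor .
  ?disn_1 rdfs:label ?label .
  FILTER(?disn_1 = <http://linkeddata.systems/SemanticPHIBase/Resource/host/HOST_00561>)
  # Keep literals as they are; keep IRIs only if they start with "http".
  FILTER(!isIRI(?valor) || STRSTARTS(STR(?valor), "http"))
}

If only IRI values are of interest, the second filter could be reduced to FILTER(isIRI(?valor) && STRSTARTS(STR(?valor), "http")).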

The query returns (SPARQL results in JSON format):

"valor": { "type": "uri", "value": "/hostncbitaxid/" }}

This is a bad result: an IRI in RDF must be absolute, so presumably the underlying data is bad. You can remove it in the query, as @joshua-taylor shows.
