Crafting a custom ontology/RDF graph from DBpedia

I used a SPARQL CONSTRUCT query to extract an RDF graph from DBpedia. The problem is that the extracted graph stores the properties (e.g., birthplace) associated with the individuals (e.g., Maria Sharapova) as annotations on the individual. How can I specify properties while constructing the graph? My code:

String service = "http://dbpedia.org/sparql";
String queryString =
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
    "PREFIX dbp: <http://dbpedia.org/property/> " +
    "PREFIX dbpedia: <http://dbpedia.org/resource/> " +
    "PREFIX dbo: <http://dbpedia.org/ontology/> " +
    "CONSTRUCT { ?entity rdfs:label ?label . ?entity rdf:type dbo:TennisPlayer . ?entity dbp:birthPlace ?b . } " +
    "WHERE { " +
    "  ?entity rdfs:label ?label . " +
    "  ?entity rdf:type dbo:TennisPlayer . " +
    "  ?entity dbp:birthPlace ?b . " +
    "  FILTER (lang(?label) = 'en') " +
    "}\n";
Query query = QueryFactory.create(queryString);
System.out.println("Query Result Sheet");
QueryExecution qe = QueryExecutionFactory.sparqlService(service, query);
Model results = qe.execConstruct();
results.write(System.out, "RDF/XML");
String fileName = "D:/tayybah/GATE data/GATE ontologies/dbpedia domain.owl";
// try-with-resources closes the writer so the output is flushed to disk
try (FileWriter outfile = new FileWriter(fileName)) {
    results.write(outfile, "RDF/XML");
}

The way I've typically done this in the past (e.g., converting an RDFS vocabulary to SKOS in "Is there a way to convert the data format of an RDF vocabulary to SKOS") is to use one or more values blocks to specify each DBpedia type or property and its corresponding type or property in my own ontology. For instance, a construct query like:

construct {
  ?entity a ?mytype ; ?myprop ?value 
}
where {
  values (?dbtype ?mytype) {
    (dbpedia-owl:TennisPlayer my:PlayerOfTennis)
  }

  values (?dbprop ?myprop) {
    (rdfs:label my:label)
    (dbpprop:birthPlace my:placeOfBirth)
  }

  ?entity a ?dbtype ; ?dbprop ?value .
}

This grabs each entity of type ?dbtype and each of its values for ?dbprop, and constructs a graph with a corresponding entity of type ?mytype with the same value for the property ?myprop. As Rob Hall pointed out, Protégé will still think that my:label and my:placeOfBirth are annotation properties unless you include declarations for them. You can actually do that in the construct query, too. If you just add to what you have in your values block, this can be as easy as, e.g.,

construct {
  ?entity a ?mytype ; ?myprop ?value .
  ?myprop a ?proptype
}
where {
  values (?dbprop ?myprop ?proptype) {
    (dbpprop:birthPlace my:placeOfBirth owl:ObjectProperty)
    #-- ... 
  }
}
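When the mapping table grows, the rows of a values block like the one above could also be generated from Java instead of being typed by hand. The following is only a minimal sketch under stated assumptions: the class and method names (ValuesBlockBuilder, valuesBlock) are hypothetical helpers, not part of Jena, and it needs Java 9 or later for List.of. It only produces the text of the block, which would then be spliced into the query string.

```java
import java.util.List;

/** Hypothetical helper (not part of Jena) that renders a SPARQL values block. */
class ValuesBlockBuilder {

    /**
     * @param vars variable names without the leading '?'
     * @param rows one list of prefixed names per row, in the same order as vars
     */
    static String valuesBlock(List<String> vars, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("values (");
        for (int i = 0; i < vars.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append('?').append(vars.get(i));
        }
        sb.append(") {\n");
        for (List<String> row : rows) {
            if (row.size() != vars.size()) {
                throw new IllegalArgumentException("row width does not match variable count: " + row);
            }
            // each row becomes one parenthesised tuple
            sb.append("  (").append(String.join(" ", row)).append(")\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same mappings as in the query above; the my: terms are the answer's example vocabulary.
        String block = valuesBlock(
                List.of("dbprop", "myprop", "proptype"),
                List.of(
                        List.of("dbpprop:birthPlace", "my:placeOfBirth", "owl:ObjectProperty"),
                        List.of("rdfs:label", "my:label", "owl:AnnotationProperty")));
        System.out.print(block);
    }
}
```

Keeping the mappings in a list means the declaration type of each property lives next to its mapping, so a newly added row cannot silently lack its declaration.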

You can use the same approach to add in language filters, too. Just remember that you can use the keyword undef in values blocks, and that you can test in a filter whether a variable is bound with bound (so !bound(?lang) is true for rows that use undef). Thus, as the final example, you can do:

prefix my: <http://example.org/vocab/>

construct {
  ?entity a ?mytype ; ?myprop ?value .
  ?mytype a owl:Class .
  ?myprop a ?proptype .
}
where {
  values (?dbtype ?mytype) {
    (dbpedia-owl:TennisPlayer my:PlayerOfTennis)
  }

  values (?dbprop ?myprop ?lang ?proptype) {
    (dbpedia-owl:birthPlace my:placeOfBirth undef owl:ObjectProperty)
    (rdfs:label my:label 'en' owl:AnnotationProperty)
  }

  ?entity a ?dbtype ; ?dbprop ?value .
   filter ( !bound(?lang) || langMatches(lang(?value),'en') )
}
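To make the filter's logic concrete, here is a rough Java sketch of what it decides for each solution. It is an illustration only, under these assumptions: the names LangFilterSketch, keep, and langMatches are hypothetical, a Java null stands in for an unbound ?lang or for a value with no language tag (such as an IRI), and langMatches follows the basic language-range matching that SPARQL specifies.

```java
import java.util.Locale;

/** Illustrative model of: filter ( !bound(?lang) || langMatches(lang(?value),'en') ) */
class LangFilterSketch {

    /** Basic language-range matching: "en" matches "en" and "en-GB"; "*" matches any non-empty tag. */
    static boolean langMatches(String tag, String range) {
        if (tag == null || tag.isEmpty()) return false; // untagged values never match
        if (range.equals("*")) return true;
        String t = tag.toLowerCase(Locale.ROOT);
        String r = range.toLowerCase(Locale.ROOT);
        return t.equals(r) || t.startsWith(r + "-");
    }

    /**
     * @param requiredLang the row's ?lang, or null when the values row says undef
     * @param valueLangTag the language tag of ?value, or null/empty when it has none
     */
    static boolean keep(String requiredLang, String valueLangTag) {
        // Rows with undef skip the language test entirely; other rows need an English value.
        return requiredLang == null || langMatches(valueLangTag, "en");
    }

    public static void main(String[] args) {
        System.out.println(keep(null, null));    // birthPlace row: ?lang is undef
        System.out.println(keep("en", "en"));    // label row, English label
        System.out.println(keep("en", "de"));    // label row, German label
    }
}
```

The first disjunct is what lets one filter serve both rows: birthPlace values pass untouched, while labels are restricted to English.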

SPARQL results

@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ns1:    <http://example.org/vocab/> .
@prefix dbpedia:    <http://dbpedia.org/resource/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .

dbpedia:Maaike_Smit rdf:type    ns1:PlayerOfTennis ;
    ns1:label   "Maaike Smit"@en ;
    ns1:placeOfBirth    dbpedia:Netherlands ,
        dbpedia:Emmeloord .
dbpedia:Alexander_Kudryavtsev   rdf:type    ns1:PlayerOfTennis ;
    ns1:label   "Alexander Kudryavtsev"@en ;
    ns1:placeOfBirth    dbpedia:Yekaterinburg .
dbpedia:Alexander_Zverev    rdf:type    ns1:PlayerOfTennis ;
    ns1:label   "Alexander Zverev"@en ;
    ns1:placeOfBirth    dbpedia:Sochi .
dbpedia:Alina_Jidkova   rdf:type    ns1:PlayerOfTennis ;
    ns1:label   "Alina Jidkova"@en ;
    ns1:placeOfBirth    dbpedia:Soviet_Union ,
        dbpedia:Moscow .

# ...

ns1:PlayerOfTennis  rdf:type    owl:Class .
ns1:label   rdf:type    owl:AnnotationProperty .
ns1:placeOfBirth    rdf:type    owl:ObjectProperty .

With Jena in Java

import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;

public class DBpediaOntologyMappingExample {
    public static void main(String[] args) {
        String query = "\n"
                + "prefix owl: <http://www.w3.org/2002/07/owl#>\n"
                + "prefix dbpedia-owl: <http://dbpedia.org/ontology/>\n"
                + "prefix my: <http://example.org/vocab/>\n"
                + "prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
                + "\n"
                + "construct {\n"
                + "  ?entity a ?mytype ; ?myprop ?value .\n"
                + "  ?mytype a owl:Class .\n"
                + "  ?myprop a ?proptype .\n"
                + "}\n"
                + "where {\n"
                + "  values (?dbtype ?mytype) {\n"
                + "    (dbpedia-owl:TennisPlayer my:PlayerOfTennis)\n"
                + "  }\n"
                + "  values (?dbprop ?myprop ?lang ?proptype) {\n"
                + "    (dbpedia-owl:birthPlace my:placeOfBirth undef owl:ObjectProperty)\n"
                + "    (rdfs:label my:label 'en' owl:AnnotationProperty)\n"
                + "  }\n"
                + "  ?entity a ?dbtype ; ?dbprop ?value .\n"
                + "  filter ( !bound(?lang) || langMatches(lang(?value),'en') )\n"
                + "}\n"
                + "limit 100";

        String dbpedia = "http://dbpedia.org/sparql";
        QueryExecution exec = QueryExecutionFactory.sparqlService( dbpedia, query );
        RDFDataMgr.write( System.out, exec.execConstruct(), Lang.N3 );
    }
}

Constructing the graph doesn't change the properties in any way. When you view the graph in Protégé, however, it has no knowledge of how to interpret the properties, and calls them annotations by default.

You have two options: include a schema for DBpedia when interpreting the triples you extract (so Protégé can see triples describing the properties), or extract those describing triples along with your construct query.

You don't seem to require definitions for anything beyond the dbp:birthPlace property, so the most trivial solution is to just add the missing triple manually:

final Model results =  qe.execConstruct();
results.add(results.getResource("http://dbpedia.org/property/birthPlace"),
            RDF.type, OWL.DatatypeProperty);

If you were to use many properties, or if you didn't know what properties you were going to retrieve, then you should either include the schemas directly (loading them into Jena models) or run a modified query that can pull properties about the properties.
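As a sketch of the "modified query" option, the construct query can carry along whatever rdf:type statements the endpoint holds about each property. The helper below only builds the query text, which could then be passed to QueryFactory.create as in the question. The class and method names (PropertyTypeQuery, build) are hypothetical, and whether DBpedia actually types a given property as owl:ObjectProperty or owl:DatatypeProperty (rather than just rdf:Property, or not at all) is an assumption you would need to check against the endpoint.

```java
/** Hypothetical helper that builds a CONSTRUCT query which also copies property declarations. */
class PropertyTypeQuery {

    static String build(String classIri, String... propertyIris) {
        StringBuilder values = new StringBuilder();
        for (String iri : propertyIris) {
            values.append(" <").append(iri).append('>');
        }
        return "CONSTRUCT {\n"
             + "  ?entity a <" + classIri + "> ; ?p ?o .\n"
             + "  ?p a ?ptype .\n"               // skipped for solutions where ?ptype is unbound
             + "}\n"
             + "WHERE {\n"
             + "  VALUES ?p {" + values + " }\n"
             + "  ?entity a <" + classIri + "> ; ?p ?o .\n"
             + "  OPTIONAL { ?p a ?ptype }\n"    // keep entities even if the property is untyped
             + "}\n";
    }

    public static void main(String[] args) {
        System.out.print(build(
                "http://dbpedia.org/ontology/TennisPlayer",
                "http://dbpedia.org/property/birthPlace",
                "http://www.w3.org/2000/01/rdf-schema#label"));
    }
}
```

Because the type pattern is OPTIONAL, a property without any declaration still contributes its data triples; it would just need a manual declaration like the one shown above.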

EDIT

The following example shows just adding schema data directly to the results when necessary.

final Model schema = ModelFactory.createDefaultModel();
// TODO populate schema
// ...
final Model results = qe.execConstruct();
results.add(schema);
