
Add basic value to Ontology individuals @Jena

I have an ontology with some classes defined and everything set up to run. What is a good way to populate it with individuals and data? In short, I want a one-way mapping from a database (as input) to the ontology.

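// Package names assume Apache Jena 3.x or later; Jena 2.x used com.hp.hpl.jena.* instead.
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;

public class Main {

    static String SOURCE = "http://www.umingo.de/ontology/bento.owl";

    static String NS = SOURCE + "#";

    public static void main(String[] args) throws Exception {
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);

        // read the RDF/XML file
        model.read(SOURCE);

        // copy the database rows into the ontology as individuals
        OntologyPreLoader loader = new OntologyPreLoader();
        model = loader.init(model);

        model.write(System.out, "RDF/XML");
    }
}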

My preloader has an init method whose goal is to copy data from a database into the ontology. Here is an excerpt:

public OntModel init(OntModel model) throws SQLException {
    // connect, statement and resultSet are assumed to be fields of this class (not shown in this excerpt)
    Resource r = model.getResource(Main.NS + "Tag");
    Property tag_name = model.createProperty(Main.NS + "Tag_Name");
    OntClass tag = r.as(OntClass.class);

    // a Statement is used to issue SQL queries to the database
    statement = connect.createStatement();
    // resultSet holds the result of the SQL query
    resultSet = statement.executeQuery("select * from niuu.tags");

    // the cursor starts before the first row, so next() moves it to the first row
    while (resultSet.next()) {
        // columns can be read by name, or by column number starting at 1,
        // e.g. resultSet.getString(2)
        String id = resultSet.getString("id");
        String name = resultSet.getString("name");

        Individual tag_tmp = tag.createIndividual(Main.NS + "Tag_" + id);
        tag_tmp.addProperty(tag_name, name);
        System.out.println("id: " + id);
        System.out.println("name: " + name);
    }

    return model;
}

Everything works, but I feel unsure about this way of preloading an ontology. Also, every individual should get its own ID so that I can match it with the database later. Can I simply define an ID property and add it to every individual?

I thought about adding the ID to "Thing", as it is the most basic type in OWL ontologies.

At first sight it looks OK. One tip: serialize the Jena model to an RDF format and open it in Protégé to get a clearer picture of what your ontology mapping looks like.

You can definitely define your own property to describe the ID of every individual. Below is an example of how to create such a property in Turtle format. (I did not add the prefixes for owl and rdfs since they are so common.) You can add this in Jena as well if needed, or load it into your model in Jena.

@prefix you: <your domain> .
you:dbIdentificator a owl:DatatypeProperty  .
you:dbIdentificator rdfs:label "<Your database identifier>"@en  .
you:dbIdentificator rdfs:comment "<Some valuable information if needed>"@en  .
you:dbIdentificator rdfs:isDefinedBy <your domain>  .
you:dbIdentificator rdfs:domain owl:Thing  .

You could also add owl:Thing to every resource, but that is not best practice because it is a vague description of a resource. I would look for vocabularies that say more about what the resource is. Take a look at GoodRelations. It is a very well-defined vocabulary that can describe information even if your use case is not commercial. Especially check out its classes.

Hope that answers some of your questions.

Programmatically generating URIs is always somewhat unsettling. If you have Guava, use Preconditions to make fail-fast assertions about what comes out of the database (so that your code lets you know if it gets out of alignment with your schema). Use the JDK's URLEncoder to ensure that the ID you get from the database is converted to a URI-friendly format. (Note that if your data contains characters that cannot be represented in XML and have no percent-encoding, you will need to handle them manually.)
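As a minimal sketch of that advice, the helper below mints an individual URI from a database ID. The class name IndividualUris and the method mint are hypothetical, and plain JDK checks stand in for Guava's Preconditions so that the example has no dependencies. It assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
final class IndividualUris {

    private IndividualUris() {}

    /**
     * Builds an individual URI such as NS + "Tag_" + id, failing fast on
     * values that suggest the code and the database schema have drifted apart.
     */
    static String mint(String namespace, String localPrefix, String dbId) {
        if (namespace == null || !(namespace.endsWith("#") || namespace.endsWith("/"))) {
            throw new IllegalArgumentException("namespace must end with '#' or '/': " + namespace);
        }
        if (localPrefix == null) {
            throw new IllegalArgumentException("local prefix must not be null");
        }
        if (dbId == null || dbId.trim().isEmpty()) {
            throw new IllegalArgumentException("database id must not be null or blank");
        }
        // URLEncoder performs form encoding, so a space becomes '+'.
        // A literal '+' is already escaped as %2B, so every remaining '+'
        // stands for a space and can safely be rewritten as %20.
        String encoded = URLEncoder.encode(dbId, StandardCharsets.UTF_8).replace("+", "%20");
        return namespace + localPrefix + encoded;
    }

    public static void main(String[] args) {
        String ns = "http://www.umingo.de/ontology/bento.owl#";
        System.out.println(mint(ns, "Tag_", "42"));
        System.out.println(mint(ns, "Tag_", "a b/c"));
    }
}
```

In the preloader loop, the call tag.createIndividual(Main.NS + "Tag_" + id) could then become tag.createIndividual(IndividualUris.mint(Main.NS, "Tag_", id)), so that a null or blank ID stops the import instead of silently producing a malformed URI.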

For your property/column values, explicitly create the literal. This makes it very clear whether you are using plain literals, language literals, or typed literals:

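// If things can have multiple names in multiple languages, for example
// (createLiteral(String, String) builds a language-tagged literal;
// createTypedLiteral would treat "en" as a datatype URI)
tag_tmp.addProperty(tag_name, model.createLiteral(name, "en"));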

Note that you may not wish to define your schema so that it implies things about owl:Thing, because that would have implications outside of your domain. Instead, define a domain-specific notion like a :DatabaseResource. Set the domains of your properties to be that class and its subclasses rather than owl:Thing. This way the use of your property implies that the subject is within your domain, rather than simply being an OWL individual (which is implied by the domain of owl:DatatypeProperty anyway).
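The domain-specific class described above might be set up in Jena roughly as follows. This is only a sketch: the names DatabaseResource and dbId are hypothetical, the package names assume Apache Jena 3.x or later, and the namespace is taken from the question.

```java
import org.apache.jena.ontology.DatatypeProperty;
import org.apache.jena.ontology.Individual;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.vocabulary.XSD;

// Hypothetical sketch; DatabaseResource and dbId are illustrative names.
class DatabaseResourceSketch {

    static final String NS = "http://www.umingo.de/ontology/bento.owl#";

    static OntModel build() {
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);

        // A domain-specific superclass for everything imported from the database
        OntClass dbResource = model.createClass(NS + "DatabaseResource");
        OntClass tag = model.createClass(NS + "Tag");
        tag.addSuperClass(dbResource);

        // The ID property applies to DatabaseResource, not to owl:Thing
        DatatypeProperty dbId = model.createDatatypeProperty(NS + "dbId");
        dbId.addDomain(dbResource);
        dbId.addRange(XSD.xstring);

        // Attach the database ID as an explicitly typed literal (xsd:string)
        Individual tag42 = tag.createIndividual(NS + "Tag_42");
        tag42.addProperty(dbId, model.createTypedLiteral("42"));

        return model;
    }

    public static void main(String[] args) {
        build().write(System.out, "TURTLE");
    }
}
```

Because the domain is DatabaseResource, using dbId on a resource only implies membership in that class, which keeps the implications inside your own vocabulary.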

EDIT : It's absolutely acceptable to create a representation of the database's unique ID and place it into the RDF model. If you are using OWL 2, you can define an OWL 2 key (owl:hasKey) on that property for your :DatabaseResource s and keep the same semantics that you had in the database.

EDIT : Noting a portion of your post on the Jena mailing list:

I have a huge MYSQL-Database for read only purpose and want to extract some Data into the Ontology.

I would highly recommend using the TDB Java API to construct a Dataset that is backed by your disk. I've worked on very large database exports before, and it's quite possible that your data size won't be tractable otherwise. TDB's indexing requires a lot of disk space, but the memory-mapped IO makes it very difficult to kill due to OOM errors. Finally, once you have constructed the database on disk, you won't have to perform this expensive import operation again (or could at least optimize it).

If you find database creation times to be prohibitive, then you may wish to use the bulk loader in creative ways. This answer has an example of using the bulk loader from Java.
