Use Apache Jena to get RDF from url

I have a crawler that collects URLs from a website containing RDF data. I tried to read the data with Jena like this:

Model model = ModelFactory.createDefaultModel();
model.read(url);
model.write(System.out);

url is a String. The first line executes, the debugger stops at the second line, and then it jumps back to the first line (because of the loop). The url is a link to a web page. I have also tried fetching the HTML source of the page and passing that string to the read function, but that didn't work either.

I'm really a rookie with RDF and Jena, and my Java experience isn't extensive, so any help is appreciated.

The code you've got for reading a model from a URL is correct. For instance, here's a complete example that reads one of the examples from section 2.13, Typed Node Elements, of the RDF/XML specification:

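// Jena 2.x package names; in Jena 3.x and later the package is org.apache.jena.rdf.model.
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class RetrieveRemoteRDF {
    public static void main(String[] args) {
        final String url = "http://www.w3.org/TR/REC-rdf-syntax/example14.nt";
        // Start with an empty in-memory model.
        final Model model = ModelFactory.createDefaultModel();
        // Fetch the document at the URL and parse its statements into the model.
        model.read(url);
        // Serialize the model to standard output.
        model.write(System.out);
    }
}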

The output (in the default RDF/XML serialization) is:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:j.0="http://example.org/stuff/1.0/" > 
  <rdf:Description rdf:about="http://example.org/thing">
    <dc:title>A marvelous thing</dc:title>
    <rdf:type rdf:resource="http://example.org/stuff/1.0/Document"/>
  </rdf:Description>
</rdf:RDF>

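If you're encountering a problem, it seems like it must be due to the url that is getting passed to model.read. The URL has to return an RDF serialization; an ordinary HTML web page is not something the RDF parser can read.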
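One way to check a suspect URL before handing it to Jena is to look at the Content-Type header the server returns. The following is only a rough sketch using the standard library: the class name RdfContentType and the method jenaLangFor are made up for illustration, and the list of media types is deliberately short rather than complete. It maps a Content-Type value to one of the syntax names Jena accepts as the second argument of model.read(url, lang), and returns null for anything else, such as text/html.

```java
import java.util.Locale;

public class RdfContentType {
    /**
     * Hypothetical helper: maps an HTTP Content-Type header value to a Jena
     * syntax name, or returns null when the type is not one of the RDF media
     * types listed here.
     */
    static String jenaLangFor(String contentType) {
        if (contentType == null) {
            return null;
        }
        // Drop parameters such as "; charset=utf-8" and normalize case.
        int semi = contentType.indexOf(';');
        String mediaType = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        switch (mediaType) {
            case "application/rdf+xml":
                return "RDF/XML";
            case "text/turtle":
                return "TURTLE";
            case "application/n-triples":
                return "N-TRIPLE";
            case "text/n3":
                return "N3";
            default:
                // For example text/html: an ordinary web page, not RDF.
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(jenaLangFor("application/rdf+xml; charset=utf-8"));
        System.out.println(jenaLangFor("text/html"));
    }
}
```

In a crawler, the header value could come from something like HttpURLConnection.getContentType() for each url. A null result suggests the link points at a regular page, so the crawler could skip it instead of calling model.read on it.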

