I have a crawler that collects URLs from websites containing RDF data. I tried to read the data with Jena like this:
Model model = ModelFactory.createDefaultModel();
model.read(url);
model.write(System.out);
`url` is a `String` pointing to a web page. The first line executes, but the debugger stops on the second line and then jumps back to the first (because of the loop). I have also tried fetching the HTML source of the page and passing that string to `read`, but that didn't work either.
I'm really a rookie to RDF and Jena, and my Java experience isn't really extensive, so any help is good.
The code you've got for reading a model from a URL is correct. For instance, here's a complete example that reads one of the examples from section 2.13 (Typed Node Elements) of the RDF/XML specification:
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class RetrieveRemoteRDF {
    public static void main(String[] args) {
        final String url = "http://www.w3.org/TR/REC-rdf-syntax/example14.nt";
        final Model model = ModelFactory.createDefaultModel();
        model.read(url);
        model.write(System.out);
    }
}
The output (in the default RDF/XML serialization) is:
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:j.0="http://example.org/stuff/1.0/" >
  <rdf:Description rdf:about="http://example.org/thing">
    <dc:title>A marvelous thing</dc:title>
    <rdf:type rdf:resource="http://example.org/stuff/1.0/Document"/>
  </rdf:Description>
</rdf:RDF>
If you're encountering a problem, it seems like it must be due to the `url` that is getting passed to `model.read`.
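Note also that `Model.read` expects actual RDF content, not an HTML page, which is likely why passing the page's HTML source failed. If you have already fetched RDF into a `String` yourself, you can hand it to Jena through a `Reader`. Here is a minimal sketch, assuming the string contains valid RDF/XML; the class name and sample data are made up for illustration:

```java
import java.io.StringReader;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class ParseRdfString {
    public static void main(String[] args) {
        // A small RDF/XML document held in a String (hypothetical sample data).
        final String rdf =
            "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
          + "         xmlns:dc='http://purl.org/dc/elements/1.1/'>"
          + "  <rdf:Description rdf:about='http://example.org/thing'>"
          + "    <dc:title>A marvelous thing</dc:title>"
          + "  </rdf:Description>"
          + "</rdf:RDF>";

        final Model model = ModelFactory.createDefaultModel();
        // read(Reader, base) parses RDF/XML by default; passing a page's
        // HTML source here would fail, because HTML is not a serialization
        // Jena's parser understands.
        model.read(new StringReader(rdf), "http://example.org/");
        model.write(System.out);
    }
}
```

If the content is in another serialization (Turtle, N-Triples, etc.), use the overload that takes a language name, e.g. `model.read(reader, base, "N-TRIPLES")`.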