How to parse a big RDF file in RDF4J

Question

I want to parse a huge file in RDF4J using the following code, but I get an exception due to a parser limit:

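import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.model.impl.LinkedHashModel;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFHandlerException;
import org.eclipse.rdf4j.rio.RDFParseException;
import org.eclipse.rdf4j.rio.RDFParser;
import org.eclipse.rdf4j.rio.RDFWriter;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.helpers.StatementCollector;

public class ConvertOntology {

    public static void main(String[] args) throws RDFParseException, RDFHandlerException, IOException {

        File initialFile = new File("swetodblp_april_2008.rdf");

        // Parse the RDF/XML file and collect every statement in an in-memory model.
        RDFParser parser = Rio.createParser(RDFFormat.RDFXML);
        parser.setPreserveBNodeIDs(true);
        Model model = new LinkedHashModel();
        parser.setRDFHandler(new StatementCollector(model));
        try (InputStream input = new FileInputStream(initialFile)) {
            // The base URI has to be an absolute URI, not a plain file system path.
            parser.parse(input, initialFile.toURI().toString());
        }

        // Write the collected statements back out. try-with-resources closes the
        // stream, and failures propagate instead of being silently swallowed.
        try (OutputStream out = new FileOutputStream("swetodblp_april_2008.nt")) {
            // N-Triples, to match the .nt extension of the output file.
            RDFWriter writer = Rio.createWriter(RDFFormat.NTRIPLES, out);
            writer.startRDF();
            for (Statement st : model) {
                writer.handleStatement(st);
            }
            writer.endRDF();
        }
    }
}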

The parser has encountered more than "100,000" entity expansions in this document; this is the limit imposed by the application.

I run my code as follows, setting the two parameters as suggested on the RDF4J website:

mvn -Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0 exec:java

Any help would be appreciated.

Answer 1

The error comes from the Apache Xerces XML parser being used instead of the default JDK XML parser. Just delete the Xerces folder from your .m2 repository and the code works fine.
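
To check which XML parser is actually being picked up before deleting anything, a small diagnostic along these lines may help. It is only a sketch: the class name XmlParserCheck and its helper methods are made up for illustration, and it reports what the standard JAXP lookup (SAXParserFactory.newInstance()) resolves to, on the assumption that RDF4J ends up with the same implementation, which is not verified here. The JDK's bundled parser lives under the com.sun.org.apache.xerces.internal package, while a standalone Apache Xerces jar provides classes under org.apache.xerces.

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper, not part of RDF4J: reports which SAX implementation JAXP selects.
class XmlParserCheck {

    /** True if the factory class belongs to the parser bundled with the JDK. */
    static boolean isJdkBuiltIn(String factoryClassName) {
        return factoryClassName.startsWith("com.sun.org.apache.xerces.internal.");
    }

    /** Returns the jar (or other location) a class was loaded from, if known. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes that ship with the JDK runtime usually have no code source.
            return "JDK runtime (not a jar on the classpath)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        String name = factory.getClass().getName();
        System.out.println("SAX parser factory: " + name);
        System.out.println("Loaded from:        " + locationOf(factory.getClass()));
        System.out.println(isJdkBuiltIn(name)
                ? "The JDK's built-in parser is in use."
                : "A third-party parser is on the classpath and takes precedence.");
    }
}
```

Run it with the same classpath as the converter (for example through the same exec:java setup). If the reported location is a Xerces jar, mvn dependency:tree shows which dependency brings it in, so it can be excluded in the pom instead of being deleted from the local repository.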
