I'm using a SPARQL query to extract instances, and the query itself is valid.
However, it also returns instances whose names contain emoji (e.g., http://ko.dbpedia.org/resource/😼), which causes an error while iterating over the result set. How can I exclude emoji from the results?
SELECT DISTINCT ?s WHERE {
?s ?p ?o
FILTER regex(str(?s), "^http://ko.dbpedia.org/resource")
}
ORDER BY DESC(?s)
limit 100
The error message is as follows:
Exception in thread "main" com.hp.hpl.jena.shared.JenaException: Convert results are FAILED.:virtuoso.jdbc4.VirtuosoException: Virtuoso Communications Link Failure (timeout) : malformed input around byte 34
at virtuoso.jena.driver.VirtuosoQueryExecution$VResultSet.moveForward(VirtuosoQueryExecution.java:498)
at virtuoso.jena.driver.VirtuosoQueryExecution$VResultSet.hasNext(VirtuosoQueryExecution.java:441)
at kr.ac.kaist.dm.BBox.TypeInference.LoadTriple.processTriples(LoadTriple.java:92)
at kr.ac.kaist.dm.BBox.TypeInference.TypeInferenceMain.main(TypeInferenceMain.java:110)
The sample code is as follows:
VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create(sparql, set);
ResultSet results = vqe.execSelect();
int i = 0;
while(results.hasNext()){ // <----- LoadTriple.java:92 here.
I have also posted an extended version of this question as virtuoso-opensource issue #543.
I want to exclude emoji without having to list every allowed character, as in "FILTER regex(?s, \\"[a-zA-Z가-힣~!@#$%^&*()-_=+|'<>]+\\") }".
ENCODE_FOR_URI() should work:
FILTER regex(ENCODE_FOR_URI(str(?s)), "^http://ko.dbpedia.org/resource")
...though you would also need to URI encode the regex match string:
http%3A%2F%2Fko.dbpedia.org%2Fresource
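If you build the query string in Java, the encoded match string can be produced programmatically rather than by hand. The following is only a minimal sketch: the class and method names (UriEncodeSketch, encodeForUri, buildFilter) are hypothetical, and it assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload. URLEncoder targets HTML form encoding, so the sketch adjusts the three cases where it differs from percent-encoding of everything except unreserved characters (space, "*", and "~").

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jena or the Virtuoso driver.
final class UriEncodeSketch {

    // Percent-encode everything except unreserved characters (A-Z a-z 0-9 - _ . ~),
    // which is how ENCODE_FOR_URI treats its argument.
    static String encodeForUri(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8)
                .replace("+", "%20")   // URLEncoder writes a space as '+'
                .replace("*", "%2A")   // URLEncoder leaves '*' unencoded
                .replace("%7E", "~");  // URLEncoder encodes '~', which is unreserved
    }

    // Build the FILTER clause from the answer for a given IRI prefix.
    static String buildFilter(String prefix) {
        return "FILTER regex(ENCODE_FOR_URI(str(?s)), \"^" + encodeForUri(prefix) + "\")";
    }

    public static void main(String[] args) {
        // prints http%3A%2F%2Fko.dbpedia.org%2Fresource
        System.out.println(encodeForUri("http://ko.dbpedia.org/resource"));
        // U+1F63C (the emoji from the question) as a surrogate pair; prints %F0%9F%98%BC
        System.out.println(encodeForUri("\uD83D\uDE3C"));
        System.out.println(buildFilter("http://ko.dbpedia.org/resource"));
    }
}
```

Note that the "." characters in the encoded prefix are still regex wildcards, exactly as in the hand-written pattern above; they match a literal dot as well, so the filter behaves the same.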