SPARQL Query to escape emojis?

I'm using a SPARQL query to extract instances, and the query itself is valid.

However, the query also returns instances whose names contain an emoji (e.g., http://ko.dbpedia.org/resource/😼), and this causes an error while iterating over the query result set. How can I exclude IRIs that contain emojis?

SELECT DISTINCT ?s WHERE {
  ?s ?p ?o .
  FILTER regex(str(?s), "^http://ko.dbpedia.org/resource")
}
ORDER BY DESC(?s)
LIMIT 100

The error message is as follows:

Exception in thread "main" com.hp.hpl.jena.shared.JenaException: Convert results are FAILED.:virtuoso.jdbc4.VirtuosoException: Virtuoso Communications Link Failure (timeout) : malformed input around byte 34
    at virtuoso.jena.driver.VirtuosoQueryExecution$VResultSet.moveForward(VirtuosoQueryExecution.java:498)
    at virtuoso.jena.driver.VirtuosoQueryExecution$VResultSet.hasNext(VirtuosoQueryExecution.java:441)
    at kr.ac.kaist.dm.BBox.TypeInference.LoadTriple.processTriples(LoadTriple.java:92)
    at kr.ac.kaist.dm.BBox.TypeInference.TypeInferenceMain.main(TypeInferenceMain.java:110)

The sample code is as follows:

VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create(sparql, set);
ResultSet results = vqe.execSelect();

int i = 0;
while (results.hasNext()) {   // <----- LoadTriple.java:92 here
    // ... (loop body not shown)
}

I have also posted an extended version of this question as virtuoso-opensource issue #543.

I just want to exclude emojis, rather than whitelist every allowed character with something like FILTER regex(?s, "[a-zA-Z가-힣~!@#$%^&*()-_=+|'<>]+").
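To illustrate the "exclude rather than whitelist" idea, here is a minimal Java sketch that flags any string containing a code point above U+FFFF. Treating every supplementary code point as "emoji-like" is my own simplifying assumption (some emojis live in the Basic Multilingual Plane and would not be caught), and the class and method names are hypothetical. Note that this is a client-side check, so it cannot by itself prevent a failure raised inside the driver while fetching rows.

```java
final class SupplementaryCheck {
    // Returns true if the string contains any code point outside the
    // Basic Multilingual Plane (above U+FFFF), where most emojis are encoded.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // \uD83D\uDE3C is the surrogate pair for U+1F63C (the cat emoji from the question).
        String emojiIri = "http://ko.dbpedia.org/resource/\uD83D\uDE3C";
        // \uAC00 is the Hangul syllable U+AC00, which is inside the BMP.
        String hangulIri = "http://ko.dbpedia.org/resource/\uAC00";

        System.out.println(hasSupplementary(emojiIri));
        System.out.println(hasSupplementary(hangulIri));
    }
}
```

Hangul IRIs pass this check because Hangul syllables are all below U+FFFF, so no character whitelist is needed.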

ENCODE_FOR_URI() should work:

FILTER regex(ENCODE_FOR_URI(str(?s)), "^http://ko.dbpedia.org/resource")

...though you would also need to URI-encode the regex match string, because ENCODE_FOR_URI() percent-encodes the ":" and "/" characters as well:

http%3A%2F%2Fko.dbpedia.org%2Fresource
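To see what the encoded strings look like, here is a small Java sketch that approximates ENCODE_FOR_URI() by keeping the RFC 3986 unreserved characters and percent-encoding every other UTF-8 byte. It is only an approximation for illustration, not the Virtuoso or Jena implementation, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class EncodeForUriSketch {
    // Keeps A-Z, a-z, 0-9, '-', '_', '.', '~' and percent-encodes
    // every other byte of the UTF-8 representation.
    static String encodeForUri(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '_' || c == '.' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append(String.format("%%%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The prefix used in the regex.
        System.out.println(encodeForUri("http://ko.dbpedia.org/resource"));
        // An IRI ending in U+1F63C, written as a surrogate pair.
        System.out.println(encodeForUri("http://ko.dbpedia.org/resource/\uD83D\uDE3C"));
    }
}
```

The first line printed is http%3A%2F%2Fko.dbpedia.org%2Fresource, and the emoji in the second IRI becomes the four escaped bytes %F0%9F%98%BC, so the encoded value is plain ASCII.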
