Jena SDB IRI validation

I have several unusual IRIs that I want to insert into Jena SDB, but each one produces an error message:

  1. http://example.org/text/1234#offset_2311_2317_10-12%
    the error message is:
    Code: 30/ILLEGAL_PERCENT_ENCODING in FRAGMENT: The host component a percent occurred without two following hexadecimal digits.
  2. http://example.org/text/5678#offset_365_370_NDZ#2
    the error message is:
    Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
  3. http://example.org/text/7890#offset_8872_8878__ "Fren
    the error message is:
    Code: 4/UNWISE_CHARACTER in FRAGMENT: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.

The strings 10-12%, NDZ#2 and _"Fren are extracted from a plain-text document, and I have to append them directly to the end of the IRIs. So my question is: are these valid IRIs? If not, given that I need to append plain text to the end of the IRIs, how can I convert them into valid IRIs?

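IRI 1 is invalid because it ends in a bare %. The percent sign introduces a percent-encoded octet, so it must always be followed by two hexadecimal digits (%xx).

To include a literal %, encode it as %25.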

IRI 2 is invalid because it contains two # characters, and an IRI can have only one fragment. If you mean # as a literal character rather than as the fragment delimiter, encode it as %23.

IRI 3 has a " in it. Encode it as %22.

Spaces are a bad idea as well. Encode them as %20.
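As a rough illustration of the encoding rules above, here is a minimal Python sketch that percent-encodes the extracted text before appending it to the fragment. The helper name make_iri and its parameters are hypothetical, and the base URL and offset layout are simply taken from the examples in the question. It assumes the extracted text should be treated as UTF-8, which is what urllib.parse.quote does by default.

```python
from urllib.parse import quote

# Base taken from the example IRIs in the question.
BASE = "http://example.org/text/"


def make_iri(doc_id: str, start: int, end: int, text: str) -> str:
    """Build an IRI whose fragment ends with percent-encoded plain text.

    Hypothetical helper: the name and the offset layout are assumptions
    based on the IRIs shown in the question.
    """
    # safe="" makes quote() escape every character except ASCII letters,
    # digits and "_.-~", so "%", "#", '"' and spaces are all encoded.
    encoded = quote(text, safe="")
    return f"{BASE}{doc_id}#offset_{start}_{end}_{encoded}"


print(make_iri("1234", 2311, 2317, "10-12%"))
print(make_iri("5678", 365, 370, "NDZ#2"))
print(make_iri("7890", 8872, 8878, '_ "Fren'))
```

With this approach the original text can be recovered later with urllib.parse.unquote. If you build the IRIs in Java instead, note that java.net.URLEncoder performs form encoding and turns spaces into + rather than %20, so it is not a drop-in equivalent.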
