
Best way to validate non-printable ASCII characters in XML

Our application needs to validate incoming XML messages for non-printable ASCII characters. We currently know of two ways to do this:

  1. Change the XSD to include the restriction.

  2. Validate the input XML string in the Java application using a regular expression.

Which approach performs better, given that our application has to return a response within a few seconds? Is there any other option for doing this?

It's mainly a matter of opinion, but if you already have an XSD, that seems to be the natural place to put the validation. The one thing to consider is that XSD validation can only pass or fail, whereas with ad-hoc Java validation you can ignore the non-printable characters, replace them, or take some other action without rejecting the input completely.
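
For the Java route, a minimal sketch of the three behaviours mentioned above (reject, drop, replace) could look like this. The class and method names are made up for illustration, and the character class assumes "non-printable ASCII" means U+0000 to U+001F plus DEL; narrow it if tabs and line breaks are acceptable in your messages.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the names are illustrative, not from any library. */
final class AsciiControlCheck {
    // Assumption: "non-printable ASCII" = control characters 0x00-0x1F plus DEL (0x7F).
    // Compiled once so repeated calls do not pay the compilation cost again.
    private static final Pattern CONTROL = Pattern.compile("[\\x00-\\x1F\\x7F]");

    /** Strict mode: true only if the text contains no ASCII control characters. */
    static boolean isClean(String text) {
        return !CONTROL.matcher(text).find();
    }

    /** Lenient mode: drop every control character and keep the rest. */
    static String strip(String text) {
        return CONTROL.matcher(text).replaceAll("");
    }

    /** Lenient mode: swap each control character for a single space. */
    static String replaceWithSpace(String text) {
        return CONTROL.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "<name>Ada\tLovelace</name>";
        System.out.println(isClean(input));
        System.out.println(strip(input));
        System.out.println(replaceWithSpace(input));
    }
}
```

Note that running this over a whole pretty-printed document would also flag the tabs and newlines used for indentation, so in practice you would probably apply it to individual text values or exclude those characters from the pattern.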

The only characters that are (a) ASCII, (b) non-printable, and (c) allowed in XML 1.0 documents are CR, LF (newline), and TAB, plus DEL (0x7F), which XML 1.0 permits but discourages. I find it hard to see why excluding those three whitespace characters is especially important, but if you already have an XSD schema, then it makes sense to add the restriction there.
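
To illustrate the XSD route from Java, here is a rough sketch using the standard javax.xml.validation API. The schema, the element name "message", and the class name are hypothetical; the pattern [ -~]* (space through tilde) is just one way of saying "printable ASCII only" and is an assumption about what you want to allow.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical example class; not taken from the original question. */
final class XsdPrintableCheck {
    // Illustrative schema: <message> may only contain characters from space to tilde.
    private static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='message'>"
        + "<xs:simpleType>"
        + "<xs:restriction base='xs:string'>"
        + "<xs:pattern value='[ -~]*'/>"
        + "</xs:restriction>"
        + "</xs:simpleType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Compiles the schema; the resulting Schema object can be reused across requests. */
    static Schema compileSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    /** Returns true if the document is well-formed and satisfies the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            // A Validator is not thread-safe, so a fresh one is created per call.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or rejected by the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = compileSchema();
        System.out.println(isValid(schema, "<message>hello world</message>"));
        System.out.println(isValid(schema, "<message>hello\tworld</message>"));
    }
}
```

On the performance question, the usual advice is to compile the Schema once and reuse it, since schema compilation tends to be the expensive step. Also note that a raw control character such as BEL never reaches the schema at all: the parser rejects the document as not well-formed first.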

The usual approach is not to make these three characters invalid, but to treat them as equivalent to space characters, which you can do by using a data type whose whiteSpace facet is "replace" or "collapse", such as xs:normalizedString or xs:token.
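
The effect of the whitespace facet can be seen in a small experiment. This is only a sketch: the schema, the "message" element, and the class name are invented for the demonstration, and it relies on the validator normalizing whitespace before checking the pattern, which is how the JDK's built-in validator behaves as far as I know.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical demo class; names are illustrative only. */
final class WhitespaceFacetDemo {
    /** Builds a schema whose <message> element restricts the given built-in base type. */
    static Schema schemaFor(String baseType) {
        String xsd =
              "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='message'>"
            + "<xs:simpleType>"
            + "<xs:restriction base='" + baseType + "'>"
            + "<xs:pattern value='[ -~]*'/>"   // printable ASCII only (assumed requirement)
            + "</xs:restriction>"
            + "</xs:simpleType>"
            + "</xs:element>"
            + "</xs:schema>";
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    /** Validates the document against a schema built on the given base type. */
    static boolean isValid(String baseType, String xml) {
        try {
            schemaFor(baseType).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<message>hello\tworld</message>";
        // xs:string preserves whitespace, xs:normalizedString replaces it, xs:token collapses it.
        for (String base : new String[] {"xs:string", "xs:normalizedString", "xs:token"}) {
            System.out.println(base + " -> " + isValid(base, xml));
        }
    }
}
```

With xs:string the tab is preserved and trips the pattern; with the other two types it is turned into a space before the pattern is applied, so the same document is accepted.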
