Using Java to create XML and XSL to create HTML: escaped characters
I have a bit of an issue. I am having trouble with \r and \n and some other funky symbols. Should I escape the content of my XML with XML escapes or HTML escapes? The default Java escape utility class is doing a very poor job of it, and the custom class I found online isn't working either.
Would a good solution be to just replace \n and \r with <p> </p>, or what HTML tag would be a good choice? Thank you!
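For the <p> idea, here is a minimal sketch, assuming the text is being written out directly as HTML markup (if it is placed in an XML text node and then run through XSL, the tags would themselves be escaped). The class and method names (ParagraphSketch, escapeMarkup, toParagraphs) are made up for illustration. It treats \r\n, \r and \n as the same line break and wraps each non-empty line in its own paragraph:

```java
public class ParagraphSketch {

    // Escape the characters that would otherwise be read as markup.
    // "&" must be handled first so the other replacements are not escaped twice.
    public static String escapeMarkup(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Split on any of the three line-break styles and wrap each non-empty line in <p>...</p>.
    public static String toParagraphs(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder html = new StringBuilder();
        for (String line : text.split("\\r\\n|\\r|\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escapeMarkup(trimmed)).append("</p>");
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(toParagraphs("first line\r\nsecond & last line\n"));
    }
}
```

If single line breaks should stay inside one paragraph, replacing them with <br/> instead of closing and reopening <p> would be the usual alternative.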
A simple example would be the date value in my XML, which was passed in as a string with all escapes applied.
Original (around the same time, I don't remember exactly which): Mon, 29 Feb 2016 13:40:58 EST (-0500)
Escaped XML entry: <Date>Mon&#044; 29 Feb 2016 03&#058;40&#058;43 EST&#040;&#045;0500&#041;</Date>
Parsed HTML output: Mon, 29 Feb 2016 03:40:43 EST(-0500)
Something clearly went wrong in the encoding and decoding of the special characters when this is parsed into HTML.
EDIT: I also have this junk, which I don't even recognize: was: 
EDIT: I fixed the date issue, but parts are still not being encoded properly.
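// StringEscapeUtils is assumed to come from Apache Commons Lang; the post does not say.
public static String entityEncode(String text) {
    String result = text;
    if (result == null)
        return result;
    // Make illegal control characters visible first, then escape the XML markup characters.
    return StringEscapeUtils.escapeXml(XMLStringUtil.escapeControlChrs(result));
}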
And the other class is:
import java.util.HashSet;

public class XMLStringUtil {

    // Characters that may not appear in XML 1.0 text. Tab, line feed and
    // carriage return (U+0009, U+000A, U+000D) are legal and deliberately absent.
    private static HashSet<Character> illegalChrSet = new HashSet<>();

    static {
        final String illegalChrs = "\u0000\u0001\u0002\u0003\u0004\u0005" +
                "\u0006\u0007\u0008\u000B\u000C\u000E\u000F\u0010\u0011\u0012" +
                "\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001A\u001B\u001C" +
                "\u001D\u001E\u001F\uFFFE\uFFFF";
        for (int i = 0; i < illegalChrs.length(); i++) {
            illegalChrSet.add(illegalChrs.charAt(i));
        }
    }

    // Replaces each illegal character with a visible backslash-x plus four hex digits.
    public static String escapeControlChrs(String str) {
        if (str == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char chr = str.charAt(i);
            if (illegalChrSet.contains(chr)) {
                sb.append("\\x");
                sb.append(String.format("%04x", (int) chr));
            } else {
                sb.append(chr);
            }
        }
        return sb.toString();
    }

    // Drops every illegal character and keeps the rest unchanged.
    public static String removeControlChrs(String str) {
        if (str == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char chr = str.charAt(i);
            if (!illegalChrSet.contains(chr)) {
                sb.append(chr);
            }
        }
        return sb.toString();
    }
}
But I still get this junk in the XML:
<Info>The origin domain used for comparison was: 
google.ca.ca
blah blah blah
</Info>
It occurs on new lines.
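Since the junk shows up at line breaks, one thing worth ruling out is a leftover carriage return from Windows-style \r\n line endings. That is an assumption, because the post does not show the exact characters. A carriage return is legal in XML 1.0, so a filter like illegalChrSet lets it through untouched. A minimal sketch, with a hypothetical class and method name, that collapses every line-ending style to a single \n before the text is encoded:

```java
public class LineEndingSketch {

    // Collapse \r\n pairs first, then any remaining lone \r, into a single \n.
    public static String normalizeLineEndings(String text) {
        if (text == null) {
            return null;
        }
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String raw = "The origin domain used for comparison was: \r\ngoogle.ca.ca\r\n";
        String clean = normalizeLineEndings(raw);
        // After normalizing, no carriage return should be left in the text.
        System.out.println(clean.indexOf('\r') < 0);
    }
}
```

The call would go on the raw string before entityEncode, so that whatever escaping follows only ever sees \n.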
The problem is in how you are encoding to XML itself. The HTML side is parsing the values properly. In HTML, &amp; is &. Please check how you are encoding to XML. The XML should not contain all those escaped ASCII characters.
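One way to check the encoding step is to stop escaping by hand and let an XML API do it. The following is a minimal sketch, assuming the XML can be built with the JDK's own DOM and Transformer classes instead of concatenating pre-escaped strings. The class and method names (DomEscapeSketch, toXml) are hypothetical. The serializer escapes what XML requires in text, such as & and <, and leaves commas, colons and parentheses alone:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomEscapeSketch {

    // Builds <tag>rawText</tag> and serializes it; the raw text is passed in unescaped.
    public static String toXml(String tag, String rawText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element element = doc.createElement(tag);
            element.setTextContent(rawText); // no manual escaping here
            doc.appendChild(element);

            // An identity transform is used only to serialize the DOM tree to a string.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("Date", "Mon, 29 Feb 2016 13:40:58 EST (-0500)"));
        System.out.println(toXml("Info", "a < b & c"));
    }
}
```

With this approach the text is escaped exactly once, at serialization time, which avoids the double-escaping that happens when an already escaped string is escaped again.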
Basically, your string contains the character '/'. When it is encoded for XML, it gets converted to a form that HTML does not recognize. When creating the XML, replace '/' with a form that HTML will automatically convert back to '/' when it is decoded.