Removing BOM characters from AJAX-posted string

My content contains multiple BOM (EF BB BF) characters and I want to remove them. The characters are in the middle of strings, and I simply want to remove them all.

The data comes from a JavaScript source, which I get from a CKEditor instance. I then POST the variable and read it as a string on my backend, and the BOMs are there. For now they are persisted as is, but this causes errors in post-processing, when the characters are interpreted and start showing up mid-content. I suspect they come from something that was copy-pasted into my CKEditor.

I can step through the string char by char, but I don't know how to compare against the BOM. Would it somehow be possible to look at the hex values of the string's bytes and compare three-byte sequences?

The UTF-8 BOM bytes get decoded to \ufeff, the Unicode character "zero width no-break space". You can't see them, can't hear them. Filter them out with:

   var good = bad.Replace("\ufeff", "");

Try the following:

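// Removes the three characters U+00EF U+00BB U+00BF (ï»¿), which is how the BOM
// bytes appear if they were decoded with a single-byte encoding such as Latin-1.
CleanString = DirtyString.Replace("\u00EF\u00BB\u00BF", null);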
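
The two answers above target different forms of the same problem, so a combined version may be useful. The following is only a minimal sketch, assuming a C# backend on .NET 5 or later (top-level statements); `StripBoms` is a hypothetical helper name, not something from the original answers. It also prints the UTF-8 bytes of U+FEFF to show where the EF BB BF sequence from the question comes from.

```csharp
using System;
using System.Text;

// Hypothetical helper (not from the original answers): removes BOMs anywhere in the string.
static string StripBoms(string input)
{
    if (string.IsNullOrEmpty(input)) return input;

    return input
        .Replace("\uFEFF", "")               // BOM decoded correctly as UTF-8 (a single char)
        .Replace("\u00EF\u00BB\u00BF", "");  // BOM bytes mis-decoded as three Latin-1 chars
}

// U+FEFF encoded as UTF-8 is exactly the three bytes from the question.
Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes("\uFEFF"))); // EF-BB-BF

// Sample input containing both forms in the middle of the text.
string dirty = "Hello\uFEFF, wor\u00EF\u00BB\u00BFld";
Console.WriteLine(StripBoms(dirty)); // Hello, world
```

`String.Replace(string, string)` compares ordinally, so the invisible U+FEFF is matched exactly. Which of the two forms actually shows up depends on how the posted body was decoded, so checking a few stored samples first is probably worthwhile.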
