string.replace(fromCharCode(), '') cannot replace characters

When I parse the XML, it contains abnormal hex characters, so I tried to replace them with a blank space. It doesn't work at all.

Original character: �

Hex code: (253, 255)

Code:

xmlData = String.replace(String.fromCharCode(253,255)," ");

return xmlData;

I'd like to remove the "ýÿ" characters from the description. Has anyone had trouble replacing a hex character with a blank space?

Based on the answers, I've modified the code as follows:

testData = String.fromCharCode(253,255);
xmlData = xmlData.replace(String.fromCharCode(253,255), " "); 
console.log(xmlData);

But it still shows '�' on the screen.

Do you know why this still happens?

The character code is actually 255 * 256 + 253 = 65533, so you would get something like this:

xmlData = xmlData.replace(String.fromCharCode(65533)," ");
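
To see why the two-argument call does not match, it can help to inspect the strings directly. The sketch below assumes the unwanted character is U+FFFD (the Unicode replacement character, decimal 65533). `stripReplacementChar` is a hypothetical helper name; it uses a regular expression with the `g` flag because `replace()` with a string pattern only replaces the first match.

```javascript
// Two arguments produce a two-character string: "ý" (253) followed by "ÿ" (255).
const twoChars = String.fromCharCode(253, 255);

// One argument, 255 * 256 + 253 = 65533 (0xFFFD), produces the single character "�".
const replacementChar = String.fromCharCode(255 * 256 + 253);

// Hypothetical helper: replace every U+FFFD in the text, not just the first one.
function stripReplacementChar(text, replacement = " ") {
    return text.replace(/\uFFFD/g, replacement);
}

console.log(twoChars.length);                         // 2
console.log(replacementChar === "\uFFFD");            // true
console.log(stripReplacementChar("a\uFFFDb\uFFFDc")); // "a b c"
```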

The string String.fromCharCode(253,255) consists of two characters.

You should call replace() on a string instance, not on String:

var testData = String.fromCharCode(253,255);
var xmlData = testData.replace(String.fromCharCode(253,255), " ");
alert(xmlData);

Working example: http://jsfiddle.net/StURS/2/

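A related point is that strings are immutable, so `replace()` returns a new string and the result has to be assigned. A minimal sketch, with made-up sample data:

```javascript
// Sample data for illustration only: "ý" (253) and "ÿ" (255) embedded in text.
const original = "abc" + String.fromCharCode(253, 255) + "def";

// replace() is called on the string instance and returns a new string;
// the original string is left unchanged.
const cleaned = original.replace(String.fromCharCode(253, 255), " ");

console.log(cleaned);              // "abc def"
console.log(original === cleaned); // false
console.log(original.length);      // 8
```
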
I just had this problem with a messed-up SQL dump that contained both valid and invalid UTF-8 sequences, which forced a more manual conversion. Since the examples above don't cover replacing the character with a better match, I figured I'd put my two cents in here for those struggling with similar encoding problems. The following code:

  1. parses my SQL dump
  2. splits it into queries
  3. finds character codes outside the first 256 (0-255)
  4. outputs the codes and the string, with context, where each code appears
  5. replaces the broken Swedish ÅÄÖ with the correct characters using regular expressions
  6. outputs the replaced string for checking

"use strict";

const fs = require("fs");

const fn = "my_problematic_sql_dump.sql";
// Read the whole dump and split it into individual queries.
const lines = fs.readFileSync(fn).toString().split(/;\n/);

// Each pattern is 65533 (U+FFFD) followed by one to three marker characters.
// Code 46 is ".", which must be escaped inside a regular expression.
const Aring = new RegExp(String.fromCharCode(65533) +
    "\\" + String.fromCharCode(46) + "{1,3}", 'g');
// Code 44 is ",".
const Auml = new RegExp(String.fromCharCode(65533) +
    String.fromCharCode(44) + "{1,3}", 'g');
// Code 45 is "-".
const Ouml = new RegExp(String.fromCharCode(65533) +
    String.fromCharCode(45) + "{1,3}", 'g');

for (let i = 0; i < lines.length; i++) {
    const l = lines[i];
    for (let ii = 0; ii < l.length; ii++) {
        // Anything above 255 is outside the single-byte range.
        if (l.charCodeAt(ii) > 255) {
            console.log("\n Invalid code at line " + i + ":");
            console.log("Code: ", l.charCodeAt(ii), l.charCodeAt(ii + 1),
                l.charCodeAt(ii + 2), l.charCodeAt(ii + 3));

            // Show 20 characters of context starting at the offending code.
            let core_str = l.substring(ii, ii + 20);
            console.log("String: ", core_str);

            core_str = core_str.replace(/[\r\n]/g, "")
                .replace(Ouml, "Ö")
                .replace(Auml, "Ä")
                .replace(Aring, "Å");
            console.log("After replacements: ", core_str);
        }
    }
}

The resulting output will look something like this:

 Invalid code at line 18:
Code:  65533 45 82 65533
String:  �-R�,,LDRALEDIGT', N
After replacements:  ÖRÄLDRALEDIGT', N

 Invalid code at line 18:
Code:  65533 44 44 76
String:  �,,LDRALEDIGT', NULL
After replacements:  ÄLDRALEDIGT', NULL

 Invalid code at line 19:
Code:  65533 46 46 46
String:  �...ker med fam till
After replacements:  Åker med fam till
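
The same replacements can be tried without a dump file. The sketch below is an in-memory variant that uses the sample strings from the output above. The `REPAIRS` table and the `repairSwedish` name are illustrative assumptions, and the patterns only cover the three sequences seen in this particular dump.

```javascript
"use strict";

// Each entry maps "U+FFFD followed by 1-3 marker characters" to the intended letter.
// The markers ("-", ",", ".") are the ones observed in this dump, not a general rule.
const REPAIRS = [
    [/\uFFFD-{1,3}/g, "Ö"],
    [/\uFFFD,{1,3}/g, "Ä"],
    [/\uFFFD\.{1,3}/g, "Å"],
];

// Apply every repair pattern in turn to the given text.
function repairSwedish(text) {
    return REPAIRS.reduce(
        (result, [pattern, letter]) => result.replace(pattern, letter),
        text
    );
}

console.log(repairSwedish("\uFFFD-R\uFFFD,,LDRALEDIGT")); // ÖRÄLDRALEDIGT
console.log(repairSwedish("\uFFFD...ker med fam till"));  // Åker med fam till
```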

A few things that I found worth noting:

  • The 65533 is sometimes followed by a varying number of regular characters that determine the actual character, hence the {1,3}.
  • The Aring pattern contains a "." which, in a regular expression, matches any character, so it needs the additional "\\" to be treated literally.
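
When a pattern is built from `String.fromCharCode()`, escaping every regex metacharacter avoids having to remember which codes need the extra backslash. The `escapeRegExp` helper below is a commonly used idiom rather than a built-in, and `markerPattern` is a hypothetical name introduced for this sketch.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build "U+FFFD followed by 1-3 copies of the marker character".
function markerPattern(markerCode) {
    const marker = escapeRegExp(String.fromCharCode(markerCode));
    return new RegExp("\\uFFFD" + marker + "{1,3}", "g");
}

const aringPattern = markerPattern(46); // code 46 is ".", escaped automatically

console.log("\uFFFD..ker".replace(aringPattern, "Å")); // Åker
// Left unchanged, because "x" is not a literal ".":
console.log("\uFFFDxxker".replace(aringPattern, "Å"));
```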

If you need to replace() all occurrences in the text (globally):

let data = 'Hello' + String.fromCharCode(32,32,32) + 'World' + String.fromCharCode(32,32,32) + '!';
let find = String.fromCharCode(32,32,32); // 3x space
let regex = new RegExp(find, 'g');
let updatedData = data.replace(regex, ' _TEXT_ ');
alert(updatedData);
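
In runtimes that support `String.prototype.replaceAll()` (ES2021), a literal search string can be replaced globally without building a regular expression, so no escaping is needed. A minimal sketch with the same kind of sample data:

```javascript
const find = String.fromCharCode(32, 32, 32); // three spaces
const data = "Hello" + find + "World" + find + "!";

// replaceAll() treats a string pattern literally and replaces every occurrence.
const updatedData = data.replaceAll(find, " _TEXT_ ");

// split()/join() gives the same result in older runtimes.
const fallback = data.split(find).join(" _TEXT_ ");

console.log(updatedData);              // Hello _TEXT_ World _TEXT_ !
console.log(updatedData === fallback); // true
```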
