Change string encoded in win1250 to utf8
I'm loading a file encoded in win1250, but when I load it, I get characters like p jemce instead of příjemce (note the diacritics).
I'd like to change the encoding from win1250 to UTF-8.
I managed to do it in PHP with the following code:
$content = iconv('windows-1250', 'UTF-8', $content);
but I am unable to do it in JavaScript. I need to do this conversion on the client without sending the file to the server (so I can't use PHP as an "encoding proxy").
I've tried to use the libraries iconv-lite and text-encoding (on NPM) like this:
var reader = new FileReader();
reader.onload = () => {
var data = reader.result;
// iconv-lite
var buf = iconv.encode(data, 'win1250');
var str1 = iconv.decode(new Buffer(buf), 'utf8');
// text-encoding
var uint8array = new TextEncoder('windows-1250').encode(data);
var str2 = new TextDecoder('utf-8').decode(uint8array);
console.log(str1);
console.log(str2);
};
reader.readAsText(file);
But neither actually changed the encoding correctly. Is there anything I'm missing?
I think you could simply try reader.readAsArrayBuffer:
var reader = new FileReader();
reader.onload = () => {
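var buf = reader.result; // ArrayBuffer holding the file's raw bytes
// iconv-lite: wrap the ArrayBuffer in a Buffer before decoding
var str1 = iconv.decode(Buffer.from(buf), 'win1250');
// text-encoding
var str2 = new TextDecoder('windows-1250').decode(buf);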
console.log(str1);
console.log(str2);
};
reader.readAsArrayBuffer(file);
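readAsArrayBuffer should give you the binary data directly, so the bytes reach the decoder untouched. readAsText, by contrast, assumes UTF-8 unless told otherwise, which is where the diacritics get lost.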
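To illustrate why the raw bytes matter, here is a minimal sketch that decodes the same win1250 bytes twice, once as UTF-8 and once as windows-1250. The byte values are my own hand-written example for "příjemce", not taken from the asker's file, and it assumes a runtime whose TextDecoder knows the 'windows-1250' label (current browsers do; Node needs a build with full ICU).

```javascript
// Bytes of "příjemce" as they would be stored in a windows-1250 file
// (ř = 0xF8, í = 0xED; the remaining letters are plain ASCII).
const win1250Bytes = new Uint8Array([0x70, 0xF8, 0xED, 0x6A, 0x65, 0x6D, 0x63, 0x65]);

// Roughly what readAsText does by default: interpret the bytes as UTF-8.
// 0xF8 and 0xED do not form valid UTF-8 sequences here, so each one is
// replaced by U+FFFD and the original characters are gone for good.
const misread = new TextDecoder('utf-8').decode(win1250Bytes);

// Decoding the untouched bytes with the correct label recovers the text.
const decoded = new TextDecoder('windows-1250').decode(win1250Bytes);

console.log(misread);
console.log(decoded);
```

Once the text has been misread as UTF-8, re-encoding it cannot bring the lost bytes back, which would explain why both attempts in the question fail. It may also be worth trying reader.readAsText(file, 'windows-1250'), since readAsText accepts an optional encoding label as its second argument.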
I don't have the entire dev environment, so the above code is not fully tested; I hope it is at least a useful starting point.
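If you really need UTF-8 bytes rather than a JavaScript string (closer to what the PHP iconv call produces), something along these lines might work. win1250ToUtf8 is a hypothetical helper name of mine, and the sketch again assumes TextDecoder supports 'windows-1250' in your runtime. Note that the standard TextEncoder only ever produces UTF-8, so the conversion has to happen on the decoding side.

```javascript
// Hypothetical helper: windows-1250 bytes in, UTF-8 bytes out.
function win1250ToUtf8(bytes) {
  // Step 1: decode the legacy bytes into a JavaScript string.
  const text = new TextDecoder('windows-1250').decode(bytes);
  // Step 2: encode the string as UTF-8 (the only encoding the
  // standard TextEncoder produces).
  return new TextEncoder().encode(text);
}

// Example input: "příjemce" in windows-1250 (hand-written bytes).
const input = new Uint8Array([0x70, 0xF8, 0xED, 0x6A, 0x65, 0x6D, 0x63, 0x65]);
const utf8 = win1250ToUtf8(input);
console.log(utf8);
```

In most cases the decoded string is all you need, because strings in JavaScript are not tied to a byte encoding until you write them out again.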