
How to convert an uploaded file from charset=iso-8859-1 to charset=utf-8?

When a user uploads a file encoded as charset=iso-8859-1, its content shows up as question marks and gibberish.

I have seen that an online converter successfully converts such a file to utf-8, and after that conversion the file uploads properly. This is the site: https://subtitletools.com/convert-text-files-to-utf8-online

This is my code:

const file = document.getElementById('some-id').files[0];

const reader = new FileReader();

reader.onloadend = event => {
  let data = event.target.result;
  console.log(`[data]:`, data); // question marks / gibberish
}

reader.readAsText(file);

I have also tried reader.readAsBinaryString, but got gibberish instead of question marks.

I have also tried the utf8 library (https://www.npmjs.com/package/utf8), but it didn't work.

How does the site I mentioned above convert the file to the desired charset so that its data is not question marks or gibberish? BTW, Google Drive also handles this well.

You can use TextDecoder with your own charset:

var data = new TextDecoder('iso-8859-1').decode(await file.arrayBuffer())

I guess the harder part is figuring out which charset the file actually uses.
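One rough way to approach the guessing, as a sketch rather than a real charset detector: try strict UTF-8 first and fall back to a legacy charset only when the bytes are not valid UTF-8. The names `decodeWithFallback` and `fallbackCharset` are hypothetical and introduced here for illustration. This approach cannot tell one single-byte charset from another, so the fallback is still an assumption about the files your users upload.

```javascript
// Hypothetical helper: decode as strict UTF-8, otherwise use a fallback charset.
// `buffer` is an ArrayBuffer or typed array holding the raw file bytes.
function decodeWithFallback(buffer, fallbackCharset = 'iso-8859-1') {
  try {
    // With fatal: true, decode() throws a TypeError on invalid UTF-8
    // instead of silently inserting U+FFFD replacement characters.
    const text = new TextDecoder('utf-8', { fatal: true }).decode(buffer);
    return { charset: 'utf-8', text };
  } catch (err) {
    // Not valid UTF-8: assume the fallback charset (an assumption, not a detection).
    const text = new TextDecoder(fallbackCharset).decode(buffer);
    return { charset: fallbackCharset, text };
  }
}
```

In the browser this could be called as `decodeWithFallback(await file.arrayBuffer())`.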

If you are not using async/await, you can use the promise returned by file.arrayBuffer() instead:

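const file = document.getElementById('some-id').files[0];
file.arrayBuffer().then(ab => {
  // Decode the raw bytes as iso-8859-1 into a regular JavaScript string
  const data = new TextDecoder('iso-8859-1').decode(ab);
  console.log(`[data]:`, data);
});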
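If the goal is to upload actual UTF-8 bytes rather than only read the text, the decoded string can be re-encoded. The following is a minimal sketch of that idea; `convertToUtf8` and `sourceCharset` are hypothetical names, not part of any library. It relies on `TextEncoder`, which always produces UTF-8. Note that per the WHATWG Encoding Standard, the label `iso-8859-1` is treated as windows-1252, which differs only for bytes in the 0x80 to 0x9F range.

```javascript
// Hypothetical helper: re-encode bytes from a source charset as UTF-8.
// `buffer` is an ArrayBuffer or typed array holding the raw file bytes.
function convertToUtf8(buffer, sourceCharset = 'iso-8859-1') {
  // Step 1: interpret the bytes using the source charset to get a JS string.
  const text = new TextDecoder(sourceCharset).decode(buffer);
  // Step 2: encode that string as UTF-8 (the only encoding TextEncoder supports).
  return new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
}
```

In the browser, the result could be wrapped for upload with something like `new Blob([convertToUtf8(await file.arrayBuffer())], { type: 'text/plain;charset=utf-8' })`.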
