
How to convert uploaded file from charset=iso-8859-1 to charset=utf-8?

When a user uploads a file encoded as charset=iso-8859-1, its contents come through as question marks and gibberish.

I have seen that an online converter successfully converts such a file to UTF-8: if I run the file through that conversion first and then upload it, it is read properly. This is the site: https://subtitletools.com/convert-text-files-to-utf8-online

This is my code:

const file = document.getElementById('some-id').files[0];

const reader = new FileReader();

reader.onloadend = event => {
  let data = event.target.result;
  console.log(`[data]:`, data); // question marks / gibberish
}

reader.readAsText(file);

I have also tried reader.readAsBinaryString, but got gibberish instead of question marks.

I have also tried the utf8 library (https://www.npmjs.com/package/utf8), but it didn't work.

How does the site mentioned above convert the file to the desired charset so that its data does not turn into question marks or gibberish? BTW, Google Drive also handles this well.

You can use TextDecoder with your own charset:

var data = new TextDecoder('iso-8859-1').decode(await file.arrayBuffer())

I guess the harder part is figuring out which charset the file actually uses.
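One rough way to approach the charset question is to try a strict UTF-8 decode first and fall back to a single-byte charset only when that fails. The sketch below assumes that approach. `decodeWithFallback` and its `fallbackCharset` parameter are hypothetical names, not part of any library. It is a heuristic: it can tell "valid UTF-8" from "not UTF-8", but it cannot distinguish one single-byte charset from another.

```javascript
// Hypothetical helper (not from the original answer): try strict UTF-8 first,
// then fall back to a single-byte charset if the bytes are not valid UTF-8.
function decodeWithFallback(buffer, fallbackCharset = 'iso-8859-1') {
  try {
    // fatal: true makes decode() throw a TypeError on malformed UTF-8
    // instead of silently inserting U+FFFD replacement characters.
    return new TextDecoder('utf-8', { fatal: true }).decode(buffer);
  } catch (err) {
    // Assumption: anything that is not valid UTF-8 is in the fallback charset.
    return new TextDecoder(fallbackCharset).decode(buffer);
  }
}
```

With the file from the question, this could be called as decodeWithFallback(await file.arrayBuffer()).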

If you are not using async/await, you can do this instead:

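const file = document.getElementById('some-id').files[0];
file.arrayBuffer().then(ab => {
  // Decode the raw bytes as ISO-8859-1 into a regular JavaScript string
  const data = new TextDecoder('iso-8859-1').decode(ab);
  console.log(`[data]:`, data);
});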
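If the goal is to end up with actual UTF-8 bytes (for example, to upload a converted file), the decoded string can be re-encoded with TextEncoder, which always outputs UTF-8. This is a minimal sketch. `toUtf8Bytes` and `sourceCharset` are hypothetical names introduced for illustration, and the source charset is assumed to be known in advance.

```javascript
// Hypothetical helper: decode bytes from a known source charset,
// then re-encode the resulting string as UTF-8.
function toUtf8Bytes(buffer, sourceCharset = 'iso-8859-1') {
  const text = new TextDecoder(sourceCharset).decode(buffer);
  // TextEncoder has no charset option; its output is always UTF-8.
  return new TextEncoder().encode(text);
}
```

In the browser, the result could be wrapped as new Blob([toUtf8Bytes(await file.arrayBuffer())], { type: 'text/plain;charset=utf-8' }) before uploading. If only the text is needed, reader.readAsText(file, 'iso-8859-1') is another route, since readAsText accepts an optional encoding label as its second argument.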
