Reading Czech characters from a txt file using PHP

I'm having issues with reading Czech characters from a txt file.

I want to read .txt files containing categories line by line. With most languages I have no issue: I can read the txt file line by line and copy the categories that I want into an array.

But as soon as I read a txt file that contains categories in Czech, I get problems processing the output of my code. The Czech-specific characters come out as rubbish, even though the text file itself shows the characters correctly.

As an example: the letters ě, č, ů or ř are all output as a square, as st\ or as other rubbish, depending on the way I read the file.

Originally I used the fgets function to read a line from the text file.

As this didn't return the correct characters, I started testing with utf8_encode. That changed some characters, but it still didn't restore all of them.

Then I started experimenting with mb_detect_encoding combined with mb_convert_encoding. Later I read somewhere that fgets could sometimes return incorrect characters, so I started testing with file_get_contents. This also didn't solve the issue.

I assume the main issue is with the way I'm reading the txt file, as the output from both fgets and file_get_contents is garbled from the start.

Can anyone tell me how to read a text file with Czech characters correctly?

Thanks in advance.

OK, I found the solution myself. In case someone else runs into this issue: the txt file was in the wrong encoding. It was saved as "UCS-2 Little Endian". After loading the file in Notepad++ I could convert it to UTF-8, and that solved the problem.
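
If re-saving the file in an editor is not an option, the same conversion can probably be done in PHP itself. What follows is only a minimal sketch, not something from the original post: it assumes the mbstring extension is available, that the file really is UTF-16LE (which is what Notepad++ labels "UCS-2 Little Endian" for ordinary text), and the function name readUtf16LeLines is hypothetical. It converts the whole file before splitting it into lines, because in UTF-16LE a line feed is stored as the two bytes 0A 00, so a byte-oriented fgets would cut each line in the middle of a character.

```php
<?php
// Hypothetical helper (not from the original post): read a UTF-16LE text
// file and return its non-empty lines as UTF-8 strings.
// Assumes the mbstring extension is loaded.
function readUtf16LeLines(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read file: $path");
    }

    // Drop the UTF-16LE byte order mark (FF FE) if the file starts with one.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }

    // Convert the complete content first, then split it into lines.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');

    // \R matches any newline sequence (\r\n, \n or \r).
    $lines = preg_split('/\R/u', $utf8);

    // Remove empty lines and reindex the array.
    return array_values(array_filter($lines, function ($line) {
        return $line !== '';
    }));
}
```

A call such as `$categories = readUtf16LeLines('categories.txt');` (the file name is only an example) should then give an array of UTF-8 strings in which ě, č, ů and ř survive intact, provided the page or terminal that prints them is also set to UTF-8.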
