I have a PHP script that reads a CSV file encoded as UTF-16LE. The problem is that on some lines the array PHP builds from the CSV row is cut short because of certain Greek characters. An example is below: the row should have 7 elements, but the array has only 2. How can I solve this problem?
Array ( [0] => 205198 [1] => Label 4.2 Βάση για Σ▒ )
My code is below:
$array = file_get_contents($this->listUrl);
$array = mb_convert_encoding($array, 'UTF8', 'UTF-16LE'); // Convert the file to UTF8
$array = preg_split("/\R/", $array); // Split it by line breaks
$array = array_map(function ($v) {
    return str_getcsv($v, ";");
}, $array);
[Edit] I used the code below:
$array = str_getcsv($array, "\n");
foreach ($array as &$Row) {
    $Row = str_getcsv($Row, ";");
}
My best guess is that you need mb_split, since you are working with multibyte strings to support Greek.
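Some theory:
UTF-8 is variable-width: ASCII characters take 1 byte, and all other characters take 2 to 4 bytes (Greek letters take 2).
UTF-16 also covers all Unicode characters, using 2 bytes for most of them and 4 bytes for the rest.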
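To make the byte widths concrete, here is a minimal sketch that measures a few characters in both encodings. It assumes PHP 7.1+ with the mbstring extension; the helper name byteLengths and the sample characters are made up for illustration.

```php
<?php
// Hypothetical helper: returns [bytes in UTF-8, bytes in UTF-16LE]
// for a string that is given in UTF-8.
function byteLengths(string $utf8): array
{
    $utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');
    // strlen() counts bytes, not characters.
    return [strlen($utf8), strlen($utf16)];
}

// Sample characters written as code point escapes, so the result
// does not depend on the encoding of this source file.
$samples = [
    'Latin A'             => "A",
    'Greek capital sigma' => "\u{03A3}",
    'Greek small upsilon' => "\u{03C5}",
    'Euro sign'           => "\u{20AC}",
    'Emoji U+1F600'       => "\u{1F600}",
];

foreach ($samples as $name => $ch) {
    [$u8, $u16] = byteLengths($ch);
    printf("%-20s UTF-8: %d byte(s), UTF-16LE: %d byte(s)\n", $name, $u8, $u16);
}
```

The Greek letters come out at 2 bytes in both encodings, while the emoji needs 4 bytes in both.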
Some action:
The PHP manual describes mb_split as "Split multibyte string using regular expression".
There are also similar functions, such as mb_ereg_replace.
Example:
$array = file_get_contents($this->listUrl);
$array = mb_convert_encoding($array, 'UTF-8', 'UTF-16LE'); // Convert the file to UTF-8
// mb_split takes the bare pattern, without the "/" delimiters that preg_split uses
$array = mb_split('\r\n|\r|\n', $array); // Split it by line breaks
$array = array_map(function ($v) {
    return str_getcsv($v, ";");
}, $array);
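A possible alternative, offered as a sketch and not as part of the suggestion above, is to keep preg_split and add the u modifier. Without u, PCRE works byte by byte and \R also treats the single byte 0x85 as a line break. The UTF-8 encoding of the Greek letter υ (U+03C5) is the byte pair CF 85, so a row may be cut right after the byte CF, which would be consistent with the truncated output in the question. The sample row and the function name splitRows are hypothetical.

```php
<?php
// Hypothetical sample standing in for the converted UTF-8 file contents:
// two rows, the first containing Greek text that includes υ (bytes CF 85).
$utf8 = "205198;Label 4.2 \u{0392}\u{03AC}\u{03C3}\u{03B7} \u{03A3}\u{03C5}\u{03BD};x\r\n"
      . "205199;Other;y";

// Hypothetical helper: split UTF-8 text into lines, then parse each line as CSV.
function splitRows(string $utf8): array
{
    // The u modifier makes PCRE treat the subject as UTF-8, so \R matches
    // whole newline characters and never a byte inside a multibyte letter.
    $lines = preg_split('/\R/u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($lines === false) {
        throw new RuntimeException('Input is not valid UTF-8');
    }
    return array_map(function ($line) {
        // Separator, enclosure and escape passed explicitly (the last two are the historical defaults).
        return str_getcsv($line, ';', '"', '\\');
    }, $lines);
}

$rows = splitRows($utf8);
print_r($rows);
```

With this sample, splitting with '/\R/' (no u) produces three pieces because the first row is broken inside υ, while '/\R/u' produces the expected two rows.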
Have fun!