PHP read file with Japanese contents

I'm writing a PHP script in which I need to read data from a CSV file, some of whose contents are written in Japanese. However, I can't get the data to read or display correctly at all.

The file I'm reading is encoded in the iso-8859-1 charset. I also tried using iconv to convert it to a UTF-8 encoded file; however, doing that seemed to break the data in the file entirely, and the text wouldn't display correctly in any application afterwards.

Here's the script I'm using right now:

<?php 
    header("Content-Type: text/html; charset=ISO-8859-1"); 
    setlocale(LC_ALL, 'ja_JP.EUC-JP'); 
?>

<!DOCTYPE html>
<html lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <?php

        $row = 1;

        if (($handle = fopen("/srv/http/Japanese/testFile.csv", "r")) !== FALSE) {
            while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
                $row++;
                for ($i = 0; $i < 4; ++$i) {
                    echo $data[$i] . "<br />";
                }
                echo "<br />";
                if ($row > 1000) break;
            }
            fclose($handle);
        } else echo print_r(error_get_last(),true);
    ?>
</body>
</html>

The first two lines of PHP were added to try to fix the issue but it hasn't worked.

The output for a line in the file reading 引き込む, 762, 762, 7122 comes out looking like this:

°ú¤­¹þ¤à
762
762
7122

Also, it doesn't seem to be an issue solely with the display of the data. I also tried testing the data with if ($data[$i] == "引き込む") and it evaluates to false even when I know that's the string being read.

I've also tried other means of reading the file; however, no matter which PHP method I use, I get the exact same issue.

Any help would be greatly appreciated.

You need to either convert the CSV file with iconv to ja_JP.EUC-JP (and set the charset value in the meta tag to match), or convert the CSV to UTF-8 and set the appropriate charset (ja_JP.UTF8).
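
A note on the symptom: the garbled output in the question, °ú¤­¹þ¤à, is exactly what the EUC-JP bytes of 引き込む look like when they are displayed as ISO-8859-1, so the file is most likely EUC-JP rather than ISO-8859-1. Below is a minimal sketch of the UTF-8 route under that assumption: read the rows as raw bytes and convert each field with iconv. The helper name readEucJpCsv and the temporary demo file are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper: read a CSV stored as EUC-JP and return its rows as UTF-8 strings.
function readEucJpCsv(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // Delimiter, enclosure and escape are passed explicitly. EUC-JP multibyte
    // bytes are all >= 0xA1, so they cannot collide with these ASCII characters.
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv reports a blank line as a single null field
        }
        // Convert every field from EUC-JP (assumed source encoding) to UTF-8.
        $rows[] = array_map(
            fn ($field) => iconv('EUC-JP', 'UTF-8', $field),
            $fields
        );
    }
    fclose($handle);

    return $rows;
}

// Demo: write the line from the question as EUC-JP bytes, then read it back.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\xB0\xFA\xA4\xAD\xB9\xFE\xA4\xE0,762,762,7122\n");
$rows = readEucJpCsv($path);
unlink($path);

echo implode(' | ', $rows[0]), "\n";
```

For the page itself, the header and the meta tag would then need to declare UTF-8 instead of ISO-8859-1, and the PHP source should be saved as UTF-8 so that a literal such as "引き込む" compares equal to the converted field.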

I wanted to comment but I don't have enough points, so please forgive me if my answer is incorrect.

From what I can find on Google and Stack Overflow, this seems to be a solution; you just have to fit it into your code.

This code

setlocale(LC_ALL, 'ja_JP');
$data = array_map('str_getcsv', file('japanese.csv'));
var_dump($data);

works with the following CSV file (japanese.csv, saved in UTF-8) on my local machine.

日本語,テスト,ファイル
2行目,CSV形式,エンコードUTF-8

The results are

array(2) {
  [0]=>
  array(3) {
    [0]=>
    string(9) "日本語"
    [1]=>
    string(9) "テスト"
    [2]=>
    string(12) "ファイル"
  }
  [1]=>
  array(3) {
    [0]=>
    string(7) "2行目"
    [1]=>
    string(9) "CSV形式"
    [2]=>
    string(20) "エンコードUTF-8"
  }
}
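
The lengths in the dump are byte counts, not character counts: 日本語 is three characters but nine bytes in UTF-8. PHP compares strings byte by byte, which would also explain why an == test against a Japanese literal fails when the file and the script use different encodings. Here is a small sketch of that, assuming the script itself is saved as UTF-8 and the file data is EUC-JP (the EUC-JP bytes are written out by hand here for illustration):

```php
<?php
// The same word in two encodings. The literal assumes this file is saved as UTF-8.
$utf8  = '引き込む';
$eucJp = "\xB0\xFA\xA4\xAD\xB9\xFE\xA4\xE0"; // 引き込む as EUC-JP bytes

// strlen() counts bytes: 3 per character in UTF-8, 2 per character in EUC-JP here.
$utf8Bytes = strlen($utf8);
$eucBytes  = strlen($eucJp);

// A direct comparison looks at raw bytes, so the two never match.
$sameBytes = ($utf8 === $eucJp);

// After converting to a common encoding the comparison succeeds.
$sameText = (iconv('EUC-JP', 'UTF-8', $eucJp) === $utf8);

// bin2hex() makes the difference visible without relying on the terminal's charset.
echo bin2hex($utf8), "\n";
echo bin2hex($eucJp), "\n";
var_dump($utf8Bytes, $eucBytes, $sameBytes, $sameText);
```

Dumping a suspicious field with bin2hex like this is a quick way to check which encoding a file really uses before deciding on a conversion.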

This might help you understand more: link to other post
