
PHP - Read non-latin character dir/file name

I have some files and directories (on Windows, though eventually I will run the same script on Mac and Linux) with non-Latin characters in their names, for example:

Dir name: 01 - Проверка - X

I am trying to read that name and print it, but without success: I always get 01 - ???????? - X instead.

What I have tried:

$items = scandir('c:/myDir/');
$name = $items[2];

echo mb_detect_encoding($name); // Returns "ASCII"
echo '<br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'Windows-1252');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-15');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'Windows-1252', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'ISO-8859-1', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'ISO-8859-15', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('WINDOWS-1252', 'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('ISO-8859-1',   'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('ISO-8859-15',  'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'WINDOWS-1252', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'ISO-8859-1', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'ISO-8859-15', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

In the result I always get the same line (I base64-encoded it so you can see it is the same line):

ASCII
01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

What can I do about it?

PS: What I am eventually trying to achieve is to compare two directories. When I read the contents of one directory I cannot compare it to the other, because the dir/file names are broken: my script gets the name 01 - ???????? - X and obviously cannot find such a subdir in the second (comparison) directory.

Actually, my previous answer was not right. The problem is that PHP 5 does not support UTF-8 for file operations: on Windows, characters in a name that the system code page cannot represent come back as ?.
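To see why none of the conversions in the question could work, it may help to decode the base64 string from the question's output. The sketch below only uses that string; the variable names are mine. It suggests that the name already consists of plain ASCII bytes by the time scandir() returns it, so the Cyrillic letters are lost before mb_convert_encoding() or iconv() ever run.

```php
<?php
// The base64 string printed in the question's output.
$encoded = 'MDEgLSA/Pz8/Pz8/PyAtIFg=';
$name = base64_decode($encoded);

// True when every byte is 7-bit ASCII, i.e. no multi-byte data survived.
$isPureAscii = preg_match('/^[\x00-\x7F]*$/', $name) === 1;

// Count the literal question marks (byte 0x3F) in the name.
$questionMarks = substr_count($name, '?');

echo $name, PHP_EOL;          // 01 - ???????? - X
echo bin2hex($name), PHP_EOL; // hex dump of the raw bytes
var_dump($isPureAscii);       // bool(true)
echo $questionMarks, PHP_EOL; // 8, one per letter of "Проверка"
```

Since the question marks are real 0x3F bytes rather than a display problem, re-encoding the string cannot bring the original characters back; the name has to be read correctly in the first place.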

A workaround would be to use something like WFIO, which exposes its own protocol for file streams and allows PHP to handle UTF-8 characters in file operations. You can see in the README that the syntax would be:

scandir("wfio://directory")
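For the directory comparison mentioned in the question, a minimal sketch could look like the following. It is an illustration under assumptions, not a tested WFIO recipe: listNames(), compareDirs() and the $scheme parameter are hypothetical names of mine, and the code only prepends wfio:// when a wrapper with that name is actually registered, falling back to the built-in file functions otherwise.

```php
<?php
/**
 * Return the entry names of a directory, without "." and "..".
 * $scheme is a hypothetical parameter: "wfio://" when the WFIO
 * wrapper is available, "" to use PHP's built-in file functions.
 */
function listNames($dir, $scheme = '')
{
    $items = scandir($scheme . $dir);
    if ($items === false) {
        throw new RuntimeException("Cannot read directory: $dir");
    }
    return array_values(array_diff($items, array('.', '..')));
}

/**
 * Compare two directories by entry name (top level only).
 */
function compareDirs($a, $b, $scheme = '')
{
    $namesA = listNames($a, $scheme);
    $namesB = listNames($b, $scheme);
    return array(
        'only_in_a' => array_values(array_diff($namesA, $namesB)),
        'only_in_b' => array_values(array_diff($namesB, $namesA)),
    );
}

// Use the wfio:// prefix only if such a stream wrapper is registered.
$scheme = in_array('wfio', stream_get_wrappers(), true) ? 'wfio://' : '';
```

A call such as compareDirs('c:/myDir', 'c:/otherDir', $scheme) would then return the names present in only one of the two directories; c:/otherDir is a placeholder path. The comparison is a plain byte-wise match of names, so it only works if both listings come back in the same encoding.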

Good luck!
