
Excel csv export into a php file with fgetcsv

I'm using Excel 2010 Professional Plus to create an Excel file, which I then try to export as a UTF-8 .csv file. I do this by saving it as CSV (symbol-separated; sorry, I don't know the exact wording, as I don't have the English version and I fear the label is not translated 1:1). There I click on Tools -> Web Options and select Unicode (UTF-8) as the encoding. The example .csv is as follows:

ID;englishName;germanName
1;Austria;Österreich

So far so good, but if I now open the file with my PHP code:

 header('Content-Type: text/html; charset=UTF-8');
 iconv_set_encoding("internal_encoding", "UTF-8");
 iconv_set_encoding("output_encoding", "UTF-8");
 setlocale(LC_ALL, 'de_DE.utf8');
 $fp=fopen($filePathName,'r');
 while (($dataRow= fgetcsv($fp,0,";",'"') )!==FALSE)
 {
     print_r($dataRow);
 }
  • I get: sterreich as a result on the screen (as that is the "error", I cut all other parts of the result).
  • If I open the file with Notepad++ and look at the encoding, I see "ANSI" instead of UTF-8.
  • If I change the encoding in Notepad++ to UTF-8, the ö, ä, ... are replaced by special characters, which I have to correct manually.

If I go another route and create a new UTF-8 file with Notepad++, putting in the same data as in the Excel file, I get "Österreich" on screen when I open it with the PHP file.

Now my question is: why does this not work with Excel? Am I doing something wrong here, or am I overlooking something?

Edit: As the program will in the end be installed on Windows servers provided by customers, I need a solution that does not require installing additional tools (PHP libraries and the like are OK, but having to install VMware or Cygwin is not). There also won't be an Excel (or Office) installation on the server, because the customer will upload the .csv file via a file upload dialog. The dialog itself is not part of the problem, as I know how to handle those; I stumbled over the problem when I created an Excel file and converted it to .csv on a test machine where Excel was installed locally.

Thanks

From the PHP documentation:

Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function.

You can try:

header('Content-Type: text/html; charset=UTF-8');
$fp = fopen("log.txt", "r");
echo "<pre>";
while ( ($dataRow = fgetcsv($fp, 1000, ";")) !== FALSE ) {
    $dataRow = array_map("utf8_encode", $dataRow);
    print_r($dataRow);
}

Output

Array
(
    [0] => ID
    [1] => englishName
    [2] => germanName
)
Array
(
    [0] => 1
    [1] => Austria
    [2] => Österreich
)

I don't know why Excel is generating an ANSI file instead of UTF-8 (as you can see in Notepad++), but if that is the case, you can convert the file using iconv:

iconv --from-code=ISO-8859-1 --to-code=UTF-8 my_csv_file.csv > my_csv_file_utf8.csv
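Since the question rules out installing extra tools on the server, the same ISO-8859-1 to UTF-8 conversion can probably be done inside PHP while the file is being read. The following is only a sketch: it assumes the iconv extension is enabled, that the upload really is ISO-8859-1, and the function name `read_csv_as_utf8` is made up for illustration.

```php
<?php
// Hypothetical helper: read a semicolon-separated file stored in
// $sourceEncoding and return its rows as UTF-8 strings.
// Assumption: the iconv extension is available (it provides convert.iconv.*).
function read_csv_as_utf8(string $path, string $sourceEncoding = 'ISO-8859-1'): array
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // The built-in convert.iconv.* stream filter transcodes bytes as they are read.
    $filter = stream_filter_append(
        $fh,
        "convert.iconv.$sourceEncoding/UTF-8",
        STREAM_FILTER_READ
    );
    if ($filter === false) {
        fclose($fh);
        throw new RuntimeException("Cannot attach iconv filter for $sourceEncoding");
    }

    $rows = [];
    while (($row = fgetcsv($fh, 0, ';', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($fh);

    return $rows;
}
```

With this approach no converted copy of the file has to be written to disk; the caller just receives arrays of UTF-8 strings.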

And for people from the Czech Republic:

function convert( $str ) {
    return iconv( "CP1250", "UTF-8", $str );
}
...
while (($data = fgetcsv($this->fhandle, 1000, ";")) !== FALSE) {
$data = array_map( "convert", $data );
...

From what you say, I suspect Excel writes a UTF-8 file without a BOM, which makes guessing that the encoding is UTF-8 slightly trickier. You can confirm this diagnosis if the characters appear correctly in Notepad++ when you choose Format->Encode in UTF-8 (without BOM) (rather than Format->Convert to UTF-8 (without BOM)).
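If Notepad++ is not at hand, the presence of a BOM can also be checked from PHP by looking at the first three bytes of the file. This is a minimal sketch; the function name `has_utf8_bom` is hypothetical, and a missing BOM does not by itself prove that the file is not UTF-8.

```php
<?php
// Hypothetical helper: does the file start with the 3-byte UTF-8 BOM (EF BB BF)?
function has_utf8_bom(string $path): bool
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // Read at most three bytes; shorter files simply cannot contain a BOM.
    $head = fread($fh, 3);
    fclose($fh);

    return $head === "\xEF\xBB\xBF";
}
```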

And are you sure every user is going to use UTF-8? It sounds to me like you need something that makes a reasonably smart guess at what the real input encoding is. By "smart", I mean that the guess recognizes BOM-less UTF-8.

To cut to the chase, I'd do something like this:

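$f = fopen('file.csv', 'r');

// fgets() returns false at end of file, so compare strictly against false
while( ($row = fgets($f)) !== false ) {
    if( mb_detect_encoding($row, 'UTF-8', true) !== false ) {
        // the line is valid UTF-8: parse it as-is
        var_dump(str_getcsv( $row, ';' ));
    } else {
        // otherwise treat it as ISO-8859-1 and convert it first
        var_dump(str_getcsv( utf8_encode($row), ';' ));
    }
}

fclose($f);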

This works because you read the characters to guess the encoding, rather than lazily trusting the first 3 characters, so UTF-8 without a BOM is still recognized as UTF-8. Of course, if your csv file is not too big, you could run the encoding detection on the whole file contents: something like mb_detect_encoding(file_get_contents(...), ...)
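The whole-file variant might look roughly like the sketch below. It assumes the mbstring extension is enabled and that anything that is not valid UTF-8 is Windows-1252 (the usual Western European "ANSI" code page); both the function name `csv_rows_utf8` and that fallback are assumptions, not something the question confirms. Splitting on line breaks also means quoted fields containing embedded newlines are not handled.

```php
<?php
// Hypothetical helper: normalise raw CSV bytes to UTF-8, then parse each line.
// Assumption: input that is not valid UTF-8 is encoded as $fallback.
function csv_rows_utf8(string $bytes, string $fallback = 'Windows-1252'): array
{
    // Strict detection over the whole content: false means "not valid UTF-8".
    if (mb_detect_encoding($bytes, 'UTF-8', true) === false) {
        $bytes = mb_convert_encoding($bytes, 'UTF-8', $fallback);
    }

    // Drop a leading UTF-8 BOM so it does not end up in the first field.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        $bytes = substr($bytes, 3);
    }

    $rows = [];
    foreach (preg_split('/\r\n|\r|\n/', $bytes) as $line) {
        if ($line === '') {
            continue; // skip blank lines, including the one after a trailing newline
        }
        $rows[] = str_getcsv($line, ';', '"', '\\');
    }

    return $rows;
}
```

A caller would pass in the uploaded file's contents, for example `csv_rows_utf8(file_get_contents($filePathName))`, and get UTF-8 rows back whether Excel saved the file as ANSI or as UTF-8.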

The problem must be your file encoding; it looks like it's not UTF-8.

When I tried your example with a file that I double-checked really is UTF-8, it worked for me. I get:

Array ( [0] => 1 [1] => Austria [2] => Österreich )

Use LibreOffice (OpenOffice); it's more reliable for this sort of thing.
