
PHP export to binary Excel file - UTF-8 character encoding

I am using this simple function (taken from here) to export a PHP array to a simple binary Excel file. Writing a binary Excel file is a requirement.

public static function array_to_excel($input) 
{
    $ret = pack('ssssss', 0x809, 0x8, 0x0, 0x10, 0x0, 0x0);
    foreach (array_values($input) as $lineNumber => $row) 
    {
        foreach (array_values($row) as $colNumber => $data) 
        {
            if (is_numeric($data)) 
            {
                $ret .= pack('sssssd', 0x203, 14, $lineNumber, $colNumber, 0x0, $data);
            } 
            else 
            {
                $len = strlen($data);
                $ret .= pack('ssssss', 0x204, 8 + $len, $lineNumber, $colNumber, 0x0, $len) . $data;
            }
        }
    }
    $ret .= pack('ss', 0x0A, 0x00); 
    return $ret;
}

Calling it is simple:

Model_Utilities::array_to_excel($my_2d_array);

The function itself works well and makes it very easy to create a simple binary Excel file. The problem is with UTF-8 characters: I get strange characters like Ä¡ instead of the correct ones. Is there a way to set the character encoding in my array_to_excel function?
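To make the symptom concrete, here is a minimal sketch of what is probably happening. It assumes the cell text is "ġ" (U+0121) and that the reader decodes each byte as ISO-8859-1; both are assumptions for illustration, not something the question states.

```php
<?php
// UTF-8 bytes of "ġ" (U+0121), written as escapes so the source file encoding does not matter.
$utf8 = "\xC4\xA1";

echo strlen($utf8), "\n";              // 2: strlen() counts bytes
echo mb_strlen($utf8, 'UTF-8'), "\n";  // 1: one character

// Treat every byte as one ISO-8859-1 character, as a byte-string reader would.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n";                   // Ä¡
```

The LABEL record in the function above stores the raw bytes with a byte count, so a multi-byte UTF-8 character ends up as several single-byte characters.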

EDIT:

After wading through hundreds of obfuscated Microsoft docs before locating the OpenOffice version of the XLS format spec, I managed to do something.

However, it relies on the BIFF8 format since, as far as I can tell, BIFF5 (the format used by Excel95) has no direct UTF-16 support.

function array_to_excel($input) 
{
    $cells = '';
    foreach (array_values($input) as $lineNumber => $row) 
    {
        foreach (array_values($row) as $colNumber => $data) 
        {
            if (is_numeric($data)) 
            {
                $cells .= pack('sssssd', 0x203, 14, $lineNumber, $colNumber, 0x0, $data);
            } 
            else 
            {
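                $data = mb_convert_encoding($data, "UTF-16LE", "UTF-8");
                // The length field counts 16-bit units, so derive it from the byte count;
                // mb_strlen() would undercount characters outside the BMP (surrogate pairs).
                $len = (int) (strlen($data) / 2);
                $cells .= pack('ssssssC', 0x204, 9+2*$len, $lineNumber, $colNumber, 0x0, $len, 0x1).$data;
            }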
        }
    }
    return pack('s4', 0x809, 0x0004, 0x0600, // <- this selects BIFF8 format
                      0x10) . $cells . pack('ss', 0x0A, 0x00); 
}

$table = Array (
    Array ("Добрый день", "Bonne journée"),
    Array ("tschüß", "こんにちは。"),
    Array (30, 40));

$xls = array_to_excel($table);
file_put_contents ("sample.xls", $xls);
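A quick way to sanity-check the generated bytes is to decode the first record again with unpack(). The sketch below is only an illustration: describe_bof is a hypothetical helper name, and the header is rebuilt here with the little-endian 'v' code so the snippet stands on its own.

```php
<?php
// Hypothetical helper: decode the 8-byte BOF record written at the start of the file.
function describe_bof(string $xls): array
{
    // Four little-endian 16-bit fields: record id, body length, BIFF version, sheet type.
    return unpack('vid/vlength/vversion/vtype', substr($xls, 0, 8));
}

// Same values as the header in array_to_excel(), packed explicitly as little-endian.
$header = pack('v4', 0x809, 0x0004, 0x0600, 0x10);
$bof = describe_bof($header);

printf("id=0x%04X version=0x%04X type=0x%04X\n", $bof['id'], $bof['version'], $bof['type']);
// prints: id=0x0809 version=0x0600 type=0x0010
```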

My (French) PC version of Excel 2007 managed to open the sample file in compatibility mode, Russian and Japanese included. There is no telling how this hack would work on other variants, though.
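One variable that can be pinned down is byte order. pack() codes 's' and 'd' use the machine's native byte order and float representation, while BIFF files are little-endian. The following sketch builds the same two records with the explicitly little-endian codes 'v' and 'e'. The function names are hypothetical, and it assumes PHP 7.1 or later ('e' and intdiv() are not available in older versions).

```php
<?php
// Hypothetical helpers; the record layouts mirror the ones used in array_to_excel().

// NUMBER record (0x203): row, column, XF index, then an 8-byte little-endian double.
function biff8_number_record(int $row, int $col, float $value): string
{
    return pack('v5e', 0x203, 14, $row, $col, 0x0, $value);
}

// LABEL record (0x204) holding an uncompressed UTF-16LE string (flag byte 0x1).
function biff8_label_record(int $row, int $col, string $utf8): string
{
    $utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');
    $units = intdiv(strlen($utf16), 2); // number of 16-bit units actually written
    return pack('v6C', 0x204, 9 + 2 * $units, $row, $col, 0x0, $units, 0x1) . $utf16;
}

echo bin2hex(biff8_label_record(0, 1, "\xC4\xA1")), "\n"; // "ġ" in row 0, column 1
// prints: 04020b000000010000000100012101
```

On the usual little-endian x86 machines this produces the same bytes as the 's' and 'd' codes for the values used here, so the change only matters on other hardware.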

EDIT (bis): from the file specs linked above:

Byte Strings (BIFF2-BIFF5)

All Excel file formats up to BIFF5 contain simple byte strings. The byte string consists of the length of the string followed by the character array. The length is stored either as 8-bit value or as 16-bit value, depending on the current record. The string is not zero-terminated. The encoding of the character array is dependent on the current record.

Record LABEL, BIFF3-BIFF5:

Offset  Size  Contents
0       2     Index to row
2       2     Index to column
4       2     Index to XF record
6       var.  Byte string, 16-bit string length
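As an illustration of that layout, the sketch below builds one such LABEL record, including the 4-byte record header (id 0x204 and body size) used by the original function. biff5_label_record is a hypothetical name, and the input is again the UTF-8 encoding of "ġ".

```php
<?php
// Hypothetical helper: a BIFF3-BIFF5 LABEL record with a 16-bit length byte string.
function biff5_label_record(int $row, int $col, string $bytes): string
{
    $len = strlen($bytes); // the length field counts bytes, one per stored character
    // Record header (id, body size), then row, column, XF index and string length.
    return pack('v6', 0x204, 8 + $len, $row, $col, 0x0, $len) . $bytes;
}

$record = biff5_label_record(0, 0, "\xC4\xA1"); // UTF-8 bytes of U+0121
echo bin2hex($record), "\n";
// prints: 04020a000000000000000200c4a1
```

The length field reads 2, so the single character is stored as two byte-string characters; the record has no place to say the bytes are UTF-8.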

Unless you generate a much more complex file, I'm afraid BIFF5 is a no go.


 