繁体   English   中英

使用 PHP 自动将 HTML 表格转换为 CSV?

[英]Converting HTML Table to a CSV automatically using PHP?

我只需要使用 PHP 在 csv 中自动转换这个 html 表。 有人可以提供任何想法如何做到这一点吗? 谢谢。

$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

伙计们,我只需要$table转换成.csv文件,它可以使用一些 PHP 函数自动生成。 我们可以将该 csv 文件的路径定义为/test/home/path_to_csv

您可以使用str_get_html http://simplehtmldom.sourceforge.net/

include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

$html = str_get_html($table);



header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element)
{
        $th = array();
        foreach( $element->find('th') as $row)  
        {
            $th [] = $row->plaintext;
        }

        $td = array();
        foreach( $element->find('td') as $row)  
        {
            $td [] = $row->plaintext;
        }
        !empty($th) ? fputcsv($fp, $th) : fputcsv($fp, $td);
}


fclose($fp);

您可以在单独的 js 文件中使用此功能:

function exportTableToCSV($table, filename) {

        var $rows = $table.find('tr:has(td)'),

            // Temporary delimiter characters unlikely to be typed by keyboard
            // This is to avoid accidentally splitting the actual contents
            tmpColDelim = String.fromCharCode(11), // vertical tab character
            tmpRowDelim = String.fromCharCode(0), // null character

            // actual delimiter characters for CSV format
            colDelim = '","',
            rowDelim = '"\r\n"',

            // Grab text from table into CSV formatted string
            csv = '"' + $rows.map(function (i, row) {
                var $row = $(row),
                    $cols = $row.find('td');

                return $cols.map(function (j, col) {
                    var $col = $(col),
                        text = $col.text();

                    return text.replace('"', '""'); // escape double quotes

                }).get().join(tmpColDelim);

            }).get().join(tmpRowDelim)
                .split(tmpRowDelim).join(rowDelim)
                .split(tmpColDelim).join(colDelim) + '"',

            // Data URI
            csvData = 'data:application/csv;charset=utf-8,' + encodeURIComponent(csv);

        $(this)
            .attr({
            'download': filename,
                'href': csvData,
                'target': '_blank'
        });
    }

现在,要启动此功能,您可以使用:

$('.getfile').click(
            function() { 
    exportTableToCSV.apply(this, [$('#thetable'), 'filename.csv']);
             });

其中“getfile”应该是分配给按钮的类,您要在其中添加号召性用语。 (单击此按钮时,将出现下载弹出窗口)并且“thetable”应该是分配给要下载的表的 ID。

您还可以更改为自定义文件名以在代码中下载。

你可以用数组和正则表达式来做到这一点......见下文

$csv = array();
preg_match('/<table(>| [^>]*>)(.*?)<\/table( |>)/is',$table,$b);
$table = $b[2];
preg_match_all('/<tr(>| [^>]*>)(.*?)<\/tr( |>)/is',$table,$b);
$rows = $b[2];
foreach ($rows as $row) {
    //cycle through each row
    if(preg_match('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row)) {
        //match for table headers
        preg_match_all('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    } elseif(preg_match('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row)) {
        //match for table cells
        preg_match_all('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row,$b);
        $csv[] = strip_tags(implode(',',$b[2]));
    }
}
$csv = implode("\n", $csv);
var_dump($csv);

然后您可以使用file_put_contents()将 csv 字符串写入文件..

为了扩展已接受的答案,我这样做了,这允许我按类名忽略列并处理空白行/列。

您可以使用 str_get_html http://simplehtmldom.sourceforge.net/ 只需包含它就可以了! :)

$html = str_get_html($html); // give this your HTML string

header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element) {
  $td = array();
  foreach( $element->find('th') as $row) {
    if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
      $td [] = $row->plaintext;
    }
  }
  if (!empty($td)) {
    fputcsv($fp, $td);
  }

  $td = array();
  foreach( $element->find('td') as $row) {
    if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
      $td [] = $row->plaintext;
    }
  }
  if (!empty($td)) {
    fputcsv($fp, $td);
  }
}

fclose($fp);
exit;

如果有人在使用 Baba 的答案,但对添加的额外空白感到头疼,这将起作用:

include "simple_html_dom.php";
$table = '<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>';

$html = str_get_html($table);   

$fileName="export.csv";
header('Content-type: application/ms-excel');
header("Content-Disposition: attachment; filename=$fileName");

$fp = fopen("php://output", "w");
$csvString="";

$html = str_get_html(trim($table));
foreach($html->find('tr') as $element)
{

    $td = array();
    foreach( $element->find('th') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td);

    $td = array();
    foreach( $element->find('td') as $row)
    {
        $row->plaintext="\"$row->plaintext\"";
        $td [] = $row->plaintext;
    }
    $td=array_filter($td);
    $csvString.=implode(",", $td)."\n";
}
echo $csvString;
fclose($fp);
exit;

}

我根据在这个线程上找到的代码改编了一个简单的类,现在处理colspanrowspan 没有经过大量测试,我相信它可以优化。

用法:

require_once('table2csv.php');

$table = '<table border="1">
    <tr>
    <th colspan=2>Header 1</th>
    </tr>
    <tr>
    <td>row 1, cell 1</td>
    <td>row 1, cell 2</td>
    </tr>
    <tr>
    <td>row 2, cell 1</td>
    <td>row 2, cell 2</td>
    </tr>
    <tr>
    <td rowspan=2>top left row</td>
    <td>top right row</td>
    </tr>
    <tr>
    <td>bottom right</td>
    </tr>
    </table>';

table2csv($table,"sample.csv",true);

table2csv.php

<?php

    //download @ http://simplehtmldom.sourceforge.net/
    require_once('simple_html_dom.php');
    $repeatContentIntoSpannedCells = false;


    //--------------------------------------------------------------------------------------------------------------------

    function table2csv($rawHTML,$filename,$repeatContent) {

        //get rid of sups - they mess up the wmus
        for ($i=1; $i <= 20; $i++) { 
            $rawHTML = str_replace("<sup>".$i."</sup>", "", $rawHTML);
        }

        global $repeatContentIntoSpannedCells;

        $html = str_get_html(trim($rawHTML));
        $repeatContentIntoSpannedCells = $repeatContent;

        //we need to pre-initialize the array based on the size of the table (how many rows vs how many columns)

        //counting rows is easy
        $rowCount = count($html->find('tr'));

        //column counting is a bit trickier, we have to iterate through the rows and basically pull out the max found
        $colCount = 0;
        foreach ($html->find('tr') as $element) {

            $tempColCount = 0;

            foreach ($element->find('th') as $cell) {
                $tempColCount++;
            }

            if ($tempColCount == 0) {
                foreach ($element->find('td') as $cell) {
                    $tempColCount++;
                }
            }

            if ($tempColCount > $colCount) $colCount = $tempColCount;
        }

        $mdTable = array();

        for ($i=0; $i < $rowCount; $i++) { 
            array_push($mdTable, array_fill(0, $colCount, NULL));
        }

        //////////done predefining array

        $rowPos = 0;
        $fp = fopen($filename, "w");

        foreach ($html->find('tr') as $element) {

            $colPos = 0;

            foreach ($element->find('th') as $cell) {
                if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                    parseCell($cell,$mdTable,$rowPos,$colPos);
                }
                $colPos++;
            }

            foreach ($element->find('td') as $cell) {
                if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                    parseCell($cell,$mdTable,$rowPos,$colPos);
                }
                $colPos++;
            }   

            $rowPos++;
        }


        foreach ($mdTable as $key => $row) {

            //clean the data
            array_walk($row, "cleanCell");
            fputcsv($fp, $row);
        }
    }


    function cleanCell(&$contents,$key) {

        $contents = trim($contents);

        //get rid of pesky &nbsp's (aka: non-breaking spaces)
        $contents = trim($contents,chr(0xC2).chr(0xA0));
        $contents = str_replace("&nbsp;", "", $contents);
    }


    function parseCell(&$cell,&$mdTable,&$rowPos,&$colPos) {

        global $repeatContentIntoSpannedCells;

        //if data has already been set into the cell, skip it
        while (isset($mdTable[$rowPos][$colPos])) {
            $colPos++;
        }

        $mdTable[$rowPos][$colPos] = $cell->plaintext;

        if (isset($cell->rowspan)) {

            for ($i=1; $i <= ($cell->rowspan)-1; $i++) {
                $mdTable[$rowPos+$i][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
            }
        }

        if (isset($cell->colspan)) {

            for ($i=1; $i <= ($cell->colspan)-1; $i++) {

                $colPos++;
                $mdTable[$rowPos][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
            }
        }
    }

?>

爸爸的回答包含额外的空间。 所以,我将代码更新为:

 include "simple_html_dom.php"; $table = '<table border="1"> <tr> <th>Header 1</th> <th>Header 2</th> </tr> <tr> <td>row 1, cell 1</td> <td>row 1, cell 2</td> </tr> <tr> <td>row 2, cell 1</td> <td>row 2, cell 2</td> </tr> </table>'; $html = str_get_html($table); header('Content-type: application/ms-excel'); header('Content-Disposition: attachment; filename=sample.csv'); $fp = fopen("php://output", "w"); foreach($html->find('tr') as $element) { $td = array(); foreach( $element->find('th') as $row) { $td [] = $row->plaintext; } foreach( $element->find('td') as $row) { $td [] = $row->plaintext; } fputcsv($fp, $td); } fclose($fp);

我之前从未真正做过这个,但我发现这个教程包含了源文件,而且它很简单,很容易理解:

http://davidvielmetter.com/tricks/howto-convert-an-html-table-to-csv-using-php/

假设 out_str 有你的 html 表格数据

$csv = $out_str;
        $csv = str_replace("<table class='gradienttable'>","",$csv);
        $csv = str_replace("</table>","",$csv);
        $csv = str_replace("</td><td>",",",$csv);
        $csv = str_replace("<td>","",$csv);
        $csv = str_replace("</td>","",$csv);
        $csv = str_replace("</font>","",$csv);
        $csv = str_replace("</tr>","",$csv);
        $csv = str_replace("<tr>","\n",$csv);

从文件中删除任何 CSS

        $csv = str_replace("<font color='yellow'>","",$csv);
        $csv = str_replace("<font color='red'>","",$csv);
        $csv = str_replace("<font color='green'>","",$csv);
        $csv = str_replace("</th><th>",",",$csv);
        $csv = str_replace("<th>","",$csv);
        $csv = str_replace("</th>","",$csv);

        file_put_contents('currentFile.csv',$csv);

将文件 currentFile.csv 输出给用户

希望能帮助到你!

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM