
What is the best approach to export large CSV data using PHP/MySQL?

I'm working on a project where I need to pull data from a database containing almost 10k rows and export it to CSV. I tried the usual method of downloading a CSV, but I keep hitting the memory limit, even after setting memory_limit to 256MB.

If any of you have experienced the same problem, please share your ideas on the best solution or approach.

Really appreciate your thoughts, guys.

Here is my actual code:

$filename = date('Ymd_His').'-export.csv';

//output the headers for the CSV file
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header('Content-Description: File Transfer');
header("Content-type: text/csv");
header("Content-Disposition: attachment; filename={$filename}");
header("Expires: 0");
header("Pragma: public");

//open the file stream
$fh = fopen('php://output', 'w');

$headerDisplayed = false;

foreach ( $formatted_arr_data_from_query as $data ) {
    // Add a header row if it hasn't been added yet,
    // using the field keys of the first data row
    if ( !$headerDisplayed ) {
        fputcsv($fh, array_keys($data));
        $headerDisplayed = true;
    }

    // Put the data from the multi-dimensional array into the stream
    fputcsv($fh, $data);
}

// Close the file stream
fclose($fh);

If you really must do the processing in PHP, you'll need to use MySQL's LIMIT clause to grab a subset of your data. Grab only a certain number of rows at a time, write those out to the file, and then grab the next set.

You may need to run unset() on a few of the variables inside your querying loop. The key is not to have too many huge arrays in memory at once.

If you're grabbing entire merged tables, sort them by insert date ascending so that the second grab picks up any newer items.
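For example, here is a minimal sketch of that chunked approach, assuming a mysqli connection in $mysqli and a hypothetical `orders` table with a created_at column (adjust the names to your schema):

$fh = fopen('php://output', 'w');

$batchSize = 500;   // rows per chunk
$offset    = 0;
$headerDisplayed = false;

do {
    $result = $mysqli->query(
        "SELECT * FROM orders ORDER BY created_at ASC " .
        "LIMIT {$batchSize} OFFSET {$offset}"
    );
    $rowCount = $result->num_rows;

    while ($row = $result->fetch_assoc()) {
        if (!$headerDisplayed) {
            fputcsv($fh, array_keys($row));
            $headerDisplayed = true;
        }
        fputcsv($fh, $row);
    }

    // Free the chunk before grabbing the next one, so only one
    // batch of rows is ever held in memory
    $result->free();
    unset($result);

    $offset += $batchSize;
} while ($rowCount === $batchSize);

fclose($fh);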

As explained in this comment: https://stackoverflow.com/a/12041241/68567 using mysqldump is probably the best option. If needed, you could even execute it from PHP with the exec() command, as explained here: php exec() - mysqldump creates an empty file
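If you take that route, a hedged sketch of the exec() call might look like the following; note that --tab makes the MySQL server itself write the data file, so it needs write access to the directory (and the server's secure_file_priv setting must allow it). The credentials, database, table, and path here are all placeholders:

// Shell out to mysqldump instead of building the CSV in PHP.
// --tab writes one delimited data file per table (orders.txt here).
$dir = '/tmp/export';
$cmd = sprintf(
    "mysqldump --user=%s --password=%s --tab=%s " .
    "--fields-terminated-by=',' --fields-enclosed-by='\"' mydb orders",
    escapeshellarg('dbuser'),
    escapeshellarg('secret'),
    escapeshellarg($dir)
);
exec($cmd, $output, $exitCode);

if ($exitCode !== 0) {
    die('mysqldump failed: ' . implode("\n", $output));
}
// The data is now in /tmp/export/orders.txt, ready to move or stream.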

  • Read each data row individually from the query result set
  • write directly to php://output
  • then read the next row, etc;

rather than building any large array or building the whole CSV in memory.
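A minimal sketch of that streaming approach, assuming a mysqli connection in $mysqli and a hypothetical `orders` table (MYSQLI_USE_RESULT returns an unbuffered result set, so rows come off the server one at a time instead of being copied into PHP memory first):

$fh = fopen('php://output', 'w');

$result = $mysqli->query('SELECT * FROM orders', MYSQLI_USE_RESULT);

$headerDisplayed = false;
while ($row = $result->fetch_assoc()) {
    if (!$headerDisplayed) {
        fputcsv($fh, array_keys($row));  // column names from the first row
        $headerDisplayed = true;
    }
    fputcsv($fh, $row);  // write the row, then reuse the variable
}

$result->free();
fclose($fh);

Memory use stays flat this way, no matter how many rows the query returns.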

SHORT DESCRIPTION: Export the data in packs of several hundred rows each, reusing the same variables, so memory pressure stays low. You cannot throw an entire MySQL table into an array (and then into a CSV file); that is the main problem.

LONG DESCRIPTION: Try this to export a large table with column names (I used it and it worked well; it can also be improved, compressed, and optimised, but... later):

  1. Open the CSV file (headers, fopen, etc.)
  2. Define an array with the column names and write it: fputcsv($f, $line, $delimiter);
  3. Get a list of the ids you want (not entire rows, only ids): SELECT id FROM table WHERE condition ORDER BY your_desired_field ASC -> here you have $ids
  4. $perpage = 200; // how many lines you export to CSV in one pack
  5. Loop over $ids in steps of $perpage, running the same query as in step 3 with LIMIT/OFFSET added (use the same WHERE, and keep the ORDER BY even if you do not really need it), then fputcsv() each row of the pack (see the fleshed-out sketch below):

     for ($z = 0; $z < count($ids); $z += $perpage) {
         // important: same query as for retrieving ids, only with limit/offset added
         $q = "SELECT * FROM table WHERE same_condition ORDER BY your_desired_field ASC"
            . " LIMIT " . $perpage . " OFFSET " . $z;
         $x = [execute query $q];  // pseudocode: run the query, get the rows
         for ($k = 0; $k < count($x); $k++) {
             $line = array($x[$k]->id, $x[$k]->field1, $x[$k]->field2 /* .. */);
             fputcsv($f, $line, $delimiter);
         }
     } // end for $z

  6. Close the CSV file

So you loop through the entire result table, grab 200 rows at a time, and write them to the CSV, which stays open until you have written all the rows. All the memory you need is for those 200 rows, because you overwrite the same variables on each pass. I'm sure it can be done in a better way, but it took me several hours and I didn't find one; also, it is slightly shaped by my architecture and the app's demands, which is why I chose this solution.
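Here is a fleshed-out version of the steps above, as a hedged sketch: the table and column names ('items', 'created_at', 'field1', 'field2') and the $mysqli connection are placeholders to adapt:

$fh = fopen('php://output', 'w');

// Step 2: header row
fputcsv($fh, array('id', 'field1', 'field2'));

// Step 3: only the ids, so we know how many rows to page through
$ids = array();
$res = $mysqli->query("SELECT id FROM items WHERE active = 1 ORDER BY created_at ASC");
while ($row = $res->fetch_assoc()) {
    $ids[] = $row['id'];
}
$res->free();

// Steps 4-5: the same query plus LIMIT/OFFSET, one pack per iteration
$perpage = 200;
for ($z = 0; $z < count($ids); $z += $perpage) {
    $res = $mysqli->query(
        "SELECT id, field1, field2 FROM items WHERE active = 1 " .
        "ORDER BY created_at ASC LIMIT {$perpage} OFFSET {$z}"
    );
    while ($row = $res->fetch_assoc()) {
        fputcsv($fh, $row);  // only the current pack is ever in memory
    }
    $res->free();            // release the pack before grabbing the next
}

// Step 6
fclose($fh);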
