
Reading multiple columns from large CSV files in PHP

I need to read two columns from a large CSV file. The CSV has multiple columns and can sometimes have the following properties:

  1. ~25,000 lines
  2. Contain spaces and blank rows
  3. Be uneven (some columns longer than others)

[Image: example CSV file]

In the example CSV file above, I am only interested in the codes in the "Buy" and "Sell" columns (columns A and D).

I have written the following code (warning: it's not very elegant) to iterate over all rows and read only the columns I require. I build strings as inputs for one large MySQL query (as opposed to running many small queries).

<?php
//Increase the allowed execution time
set_time_limit(0);
ini_set('memory_limit','256M');
ini_set('max_execution_time', 0);

//Set to detect the ending of CSV files
ini_set('auto_detect_line_endings', true);

$file = "test.csv";

$buy = $sold = ""; //Initialize empty strings

if (($handle = @fopen($file, "r")) !== FALSE) {

    while (($pieces = fgetcsv($handle, 100, ",")) !== FALSE) {

        if ( ! empty($pieces[0]) ) {
            $buy .= $pieces[0] ." ";
        }

        if ( ! empty($pieces[3]) ) {
            $sold .= $pieces[3] ." ";
        }
    }

    echo "Buy ". $buy ."<br>"; //Do something with strings...
    echo "Sold ". $sold ."<br>";

    //Close the file
    fclose($handle);
}

?>

My question is: is this the best way to perform such a task? The code works for smaller test files, but are there shortcomings I've overlooked in iterating over the CSV file like this?

First, reading any large file consumes a lot of memory if you store its contents in variables. You may want to look into techniques for reading large files (more than 4 GB on Unix).

Secondly, you can output $buy and $sold inside the while loop, which may be more memory-efficient because the values are then not accumulated in those two variables.
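One way to handle each row as it is read, rather than building two long strings, is a generator. The following is only a minimal sketch: `readBuySell()` is a hypothetical helper name (not a PHP built-in), the column indexes 0 and 3 are taken from the question, and passing a length of 0 to `fgetcsv()` is a choice made here so that rows longer than 100 characters are not split.

```php
<?php
/**
 * Yield [buy, sell] pairs from a CSV file one row at a time.
 * Column indexes 0 and 3 follow the question (columns A and D).
 * readBuySell() is a hypothetical helper name, not a PHP built-in.
 */
function readBuySell(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Length 0 means no line length limit, so long rows are not split.
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            // Blank lines come back as [null]; trim() also treats
            // cells that hold only spaces as empty.
            $buy  = trim((string) ($row[0] ?? ''));
            $sell = trim((string) ($row[3] ?? ''));
            if ($buy !== '' || $sell !== '') {
                yield [$buy, $sell];
            }
        }
    } finally {
        // Runs when iteration ends or the generator is destroyed.
        fclose($handle);
    }
}
```

You could then loop with `foreach (readBuySell('test.csv') as [$buy, $sell])` and echo each value or add it to a batch for the query. Only one row is held in memory at a time, and comparing against `''` instead of using `empty()` keeps a code such as "0" from being skipped.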

Lastly, consider using the file seek method; see the PHP fseek documentation.
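As a rough illustration of how seeking might help, the sketch below reads the file in chunks and records the byte offset to resume from, so the work could be split across several requests. `readChunk()` and its return keys are hypothetical names chosen for this example, and it assumes the saved offset always comes from a previous `ftell()` call so that it falls on a row boundary.

```php
<?php
/**
 * Read up to $maxRows CSV rows starting at byte $offset.
 * Returns the rows and the byte offset to resume from.
 * readChunk() is a hypothetical helper name, not a PHP built-in.
 */
function readChunk(string $path, int $offset, int $maxRows): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // Jump to the saved position; fseek() returns -1 on failure.
    if (fseek($handle, $offset) === -1) {
        fclose($handle);
        throw new RuntimeException("Cannot seek to $offset");
    }
    $rows = [];
    // The row limit is checked first so no extra row is consumed.
    while (count($rows) < $maxRows
        && ($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    $next = ftell($handle); // byte position just after the last row read
    fclose($handle);
    return ['rows' => $rows, 'next' => $next];
}
```

A caller would start at offset 0, pass each returned `next` value into the following call, and stop once `rows` comes back empty.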
