
PHP & CSV: Delete or ignore lines that start with a character for different length rows

I have a CSV file that uses keys in the first column, and each row has a different length.

At the top of the file, the header row starts with 'M'; after that, rows starting with 'C' and 'A' alternate throughout the file, like this:

M   P395, 177   177, 13/03/13, , , , , , , FALSE,   1904.2, , , , , , , , , , , , , , 
C   QTM0039326, X6  100013424,  Example, , Example  WA  6754    AU, FALSE,  TRUE    FALSE, N,   FALSE, FALSE, FALSE Example Example Brisbane,   Brisbane City   QLD 4000    AU, , , , , , Example   TRACKADV
A   0.1, , , FALSE  FALSE   0, , , , , , , , , , , , , , , , , , , , , 
C   QTM0039226  7021130 X6  100013427,  Example, , Example  NSW 2795    AU  427181931   FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV
A   4.1 0   0   0, ARTICLE CONTAINS CONSUMER GOOD(S)    FALSE   FALSE   0   0,  FALSE   FALSE   FALSE   FALSE   FALSE, FALSE, , , , , , , , , , , , , , , , 
C   QTM0039214  7021130 X6  100013440   Example, Example, , Example QLD 4502    AU  32858429    FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV
A   1.35    0   0   0, ARTICLE CONTAINS CONSUMER GOOD(S)    FALSE   FALSE   0   0,  FALSE   FALSE   FALSE   FALSE   FALSE, FALSE, , , , , , , , , , , , , , , , 
C   QTM0039296  7021130 X6  100013349, Metro Auto Spares    Example, , Example  TAS 7310    AU  427236691   FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV
A   5.25    0   0   0, ARTICLE CONTAINS CONSUMER GOOD(S)    FALSE   FALSE   0   0,  FALSE   FALSE   FALSE   FALSE   FALSE, FALSE, , , , , , , , , , , , , , , , 
C   QTM0039300  7021130 X6  100013345,  Example, , Example  QLD 4303    AU  402131430   FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV
A   0.6 0   0   0, ARTICLE CONTAINS CONSUMER GOOD(S)    FALSE   FALSE   0   0,  FALSE   FALSE   FALSE   FALSE   FALSE, FALSE, , , , , , , , , , , , , , , , 
C   QTM0039242  7021130 X6  100008683,  Example, , Example  SA  5034    AU  403468706   FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV
A   0.6 0   0   0, ARTICLE CONTAINS CONSUMER GOOD(S)    FALSE   FALSE   0   0,  FALSE   FALSE   FALSE   FALSE   FALSE, FALSE, , , , , , , , , , , , , , , , 
C   QTM0039065  7021130 X6  100013177,  Example, , Example  VIC 3136    AU  61397233661 FALSE,  TRUE    FALSE, N    0, FALSE, FALSE, FALSE  Example Example, , Brisbane QLD 4000    AU  Example Example Example, , Example  QLD 4211    AU, Example TRACKADV

I only need the data from the 'C' rows. Is there a quick way either to delete all rows starting with 'M' and 'A', or to ignore these rows in a script?

If I manually remove all of the 'M' and 'A' rows from the target file, I can use the code below to get the data I want. But because the row lengths differ, I'm having trouble with this method, regardless of the newline character.

if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",", "\n")) !== FALSE) 
    {
        echo $data[0] . " - ". $data[1] . " - ". $data[4] . "<br/><hr>" ;
    }

    fclose($handle);
}

One way would be to turn the whitespace after the key into a comma, so that the key becomes its own field:

// $file holds the file contents as a string; preg_replace returns the new string
$file = preg_replace('#^([MCA])\s#im', '$1,', $file);

and then parse it as a normal CSV, skipping the rows you don't need:

while($line=fgetcsv($f))
{
    if($line[0]=="M" || $line[0]=="A") continue;
    /* ... */
}
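Putting the two steps above together, a minimal sketch might look like the following. The function name `extractCRows` and the use of a `php://temp` stream to feed the rewritten string to `fgetcsv` are assumptions made for illustration (they are not part of the answer), and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the parsed rows, minus the 'M' and 'A' rows.
function extractCRows(string $contents): array
{
    // Turn "C<whitespace>..." into "C,..." so the key becomes its own field.
    $normalized = preg_replace('#^([MCA])\s#im', '$1,', $contents);

    // fgetcsv() reads from a stream, so copy the rewritten text into a temp stream.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $normalized);
    rewind($stream);

    $rows = [];
    // Length 0 means "no line-length limit".
    while (($line = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        // Skip the header row and the 'A' rows.
        if ($line[0] === 'M' || $line[0] === 'A') {
            continue;
        }
        $rows[] = $line;
    }
    fclose($stream);

    return $rows;
}
```

It could be called as `extractCRows(file_get_contents('test.csv'))`; each returned row then has the key `C` at index 0 and the data fields after it.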

If you're using PHP >= 5.3, you can go the other way and filter the raw lines first, then parse the rest with str_getcsv:

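while (($line = fgets($f)) !== false)
{
    // skip rows whose key is 'M' or 'A'
    if (preg_match('#^[MA]\s#i', $line)) continue;
    // drop the key and the whitespace character after it, then parse the rest
    $line = str_getcsv(substr($line, 2));
    /* ... */
}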

One downside of this method is that it will break if your CSV contains a quoted field with an embedded line break, such as "multiline \n column", because fgets reads one physical line at a time.
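As a rough sketch of the per-line approach, the filtering and parsing can be wrapped in one small function. The name `parseKeyedLine`, the blank-line check, and the `rtrim` of the line ending are additions for illustration, and the nullable return type assumes PHP 7.1 or later. Like the loop above, it assumes exactly one whitespace character between the key and the data.

```php
<?php
// Hypothetical helper: parse one raw line, or return null if it should be skipped.
function parseKeyedLine(string $line): ?array
{
    // Ignore blank lines.
    if (trim($line) === '') {
        return null;
    }
    // Skip rows whose key is 'M' or 'A' (key followed by one whitespace character).
    if (preg_match('#^[MA]\s#i', $line)) {
        return null;
    }
    // Drop the key and its separator (2 characters), then strip the line ending.
    $payload = rtrim(substr($line, 2), "\r\n");

    return str_getcsv($payload, ',', '"', '\\');
}
```

Inside a `fgets` loop, a `null` result simply means "continue to the next line". The multiline-field limitation still applies, since each call only ever sees a single physical line.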

As for the variable-length problem, you may be interested to know that fgetcsv needs only one parameter to work (PHP >= 5); with all the other parameters left at their defaults, it reads every CSV line in full, whatever its length.
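To illustrate the point about line length, here is a small sketch that reads every row without a length cap. The helper name `readAllRows` is made up for this example; passing `0` as the length means "no limit", which is the same as leaving the argument out, and the remaining arguments are just the usual defaults written out explicitly.

```php
<?php
// Hypothetical helper: collect every row from an open CSV stream, with no length cap.
function readAllRows($handle): array
{
    $rows = [];
    // Length 0 = no limit, so long rows are returned whole instead of in chunks.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    return $rows;
}
```

With a handle from `fopen("test.csv", "r")`, rows longer than the 1000 characters used in the question should come back as single arrays, however many fields they contain.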
