Perl: reading specific rows from a CSV file

Question

I'm looking to read a certain "category" from a .csv file that looks something like this:

Category 1, header1, header2, header3,...,
          , data, data, data,...,
          , data, data, data,...,
          , data, data, data,...,
Category 2, header1, header2, header3,...,
          , data, data, data,...,
          , data, data, data,...,
          , data, data, data,...,
Category 3, header1, header2, header3,...,
          , data, data, data,...,
          , data, data, data,...,
          , data, data, data,...

Let's say I wanted to print only the data from a specific "category". How would I go about doing this?

For example, if I want to print the Category 2 data, the output should look like this:

Category 2, header1, header2, header3,...,
          , data, data, data,...,
          , data, data, data,...,
          , data, data, data,...

Answer 1

Unless your data includes quoted fields, like a,b,c,"complicated field, quoted",e,f,g, there is no advantage in using Text::CSV over a simple split /,/.
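
To make the quoted-field caveat concrete, here is a small sketch that splits the sample line above with a plain split /,/ and, only if the Text::CSV module happens to be installed, parses it with that module as well. The variable names are illustrative.

```perl
use strict;
use warnings;

# Sample line from the text: the fourth field contains a comma inside quotes.
my $line = 'a,b,c,"complicated field, quoted",e,f,g';

# A plain split treats every comma as a separator, so the quoted field
# is cut in two and we get 8 pieces instead of 7.
my @naive = split /,/, $line;
print scalar(@naive), " fields from split\n";

# Text::CSV is not a core module, so load it only if it is available.
if ( eval { require Text::CSV; 1 } ) {
    my $csv = Text::CSV->new( { binary => 1 } );
    $csv->parse($line) or die "Cannot parse line";
    my @fields = $csv->fields;
    print scalar(@fields), " fields from Text::CSV\n";
    print "Fourth field: $fields[3]\n";
}
```

For data like the question's, where no field is quoted, both approaches yield the same fields.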

This example categorizes the data into a hash that you can access simply and directly. I have used Data::Dump only to show the contents of the resulting data structure.

use strict;
use warnings;
use autodie;

open my $fh, '<', 'mydata.csv';

my $category;
my %data;
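while (<$fh>) {
  chomp;
  my @data = split /,/;
  my $cat = shift @data;                 # first field is the category; blank on data rows
  $category = $cat if $cat =~ /\S/;      # a non-blank first field starts a new category
  push @{ $data{$category} }, \@data;    # file each row under the current category
}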

use Data::Dump;
dd \%data;

Output

{
  "Category 1" => [
                    [" header1", " header2", " header3", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                  ],
  "Category 2" => [
                    [" header1", " header2", " header3", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                  ],
  "Category 3" => [
                    [" header1", " header2", " header3", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                    [" data", " data", " data", "..."],
                  ],
}
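
As a sketch of what accessing the hash "simply and directly" might look like, the following builds the same structure from an in-memory copy of a small sample and prints the rows of one category. The sample rows and the $wanted variable are made up for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for mydata.csv.
my $sample = <<'END';
Category 1, h1, h2
          , a, b
Category 2, h1, h2
          , c, d
          , e, f
END

# Read from the string as if it were a file.
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my $category;
my %data;
while (<$fh>) {
    chomp;
    my @row = split /,/;
    my $cat = shift @row;
    $category = $cat if $cat =~ /\S/;
    push @{ $data{$category} }, \@row;
}

# Look up one category by name and rebuild each of its rows as text.
my $wanted = 'Category 2';
my @out;
for my $row ( @{ $data{$wanted} || [] } ) {
    push @out, join ',', @$row;
}
print "$_\n" for @out;
```

Note that the category name itself is the hash key, so it is no longer part of the stored rows; print it separately if the output should include it.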

Update

If all you want is to separate out a given section of the file, then there is no need to put it into a hash. This program will do what you want.

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

my ($file, $wanted) = @ARGV;

open my $fh, '<', $file;

my $category;

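while (<$fh>) {
  my ($cat) = /\A([^,]*)/;             # text before the first comma
  $category = $cat if $cat =~ /\S/;    # non-blank means a new category starts here
  print if $category eq $wanted;       # echo only lines in the wanted category
}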

Run it like this on the command line:

get_category.pl mydata.csv 'Category 2' > cat2.csv

Output

Category 2, header1, header2, header3,...,
          , data, data, data,...,
          , data, data, data,...,
          , data, data, data,...

Answer 2

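If that output is exactly what you want, you can do this with a Perl one-liner (the double quotes suit the Windows command prompt; in a Unix shell, use single quotes so that $p is not expanded by the shell):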

perl -ne "$p = 0 if /^Category/;$p = 1 if /^Category 2/;print if $p;" myfile.csv
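
For readers who find the one-liner terse, the same flag-based idea can be sketched as a short script. This is an illustrative expansion, not the answerer's code: the sub name extract_category and the sample data are hypothetical, and the match is anchored on the following comma so that "Category 2" does not also select a "Category 20".

```perl
use strict;
use warnings;

# Return the lines that belong to the wanted category.
sub extract_category {
    my ( $fh, $wanted ) = @_;
    my $printing = 0;    # plays the role of $p in the one-liner
    my @lines;
    while ( my $line = <$fh> ) {
        # Every category header resets the flag; only the wanted one sets it.
        if ( $line =~ /^Category/ ) {
            $printing = $line =~ /^\Q$wanted\E\s*,/ ? 1 : 0;
        }
        push @lines, $line if $printing;
    }
    return @lines;
}

# Hypothetical sample data standing in for myfile.csv.
my $sample = <<'END';
Category 1, h1, h2
          , a, b
Category 2, h1, h2
          , c, d
Category 20, h1, h2
          , e, f
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @section = extract_category( $fh, 'Category 2' );
print @section;
```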
