
Spreadsheet::ParseExcel building an array or hash

I am new to Spreadsheet::ParseExcel. I have a space-delimited file which I opened in Microsoft Excel and saved as an XLS file.

I installed Spreadsheet::ParseExcel and used the example code in the documentation to print the contents of the file. My objective is to build an array of some of the data to write to a database. I just need a little help building the array; writing to a database I'll figure out later.

I'm having a hard time understanding this module. I did read the documentation, but because of my inexperience I'm unable to understand it.

Below is the code I'm using for the output.

#!/usr/bin/perl

use warnings;
use strict;

use Data::Dumper;
use Spreadsheet::ParseExcel;

my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse( 'file.xls' );

if ( !defined $workbook ) {
    die $parser->error(), ".\n";
}

for my $worksheet ( $workbook->worksheets() ) {

    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();

    for my $row ( $row_min .. $row_max ) {

        for my $col ( $col_min .. $col_max ) {

            my $cell = $worksheet->get_cell( $row, $col );
            next unless $cell;

            print "Row, Col    = ($row, $col)\n";
            print "Value       = ", $cell->value(),       "\n";
            print "Unformatted = ", $cell->unformatted(), "\n";
            print "\n";
        }
    }
}

And here is some of the output:

Row, Col    = (0, 0)
Value       = NewRecordFlag
Unformatted = NewRecordFlag

Row, Col    = (0, 1)
Value       = AgencyName
Unformatted = AgencyName

Row, Col    = (0, 2)
Value       = CredentialIdnt
Unformatted = CredentialIdnt

Row, Col    = (0, 3)
Value       = ContactIdnt
Unformatted = ContactIdnt

Row, Col    = (0, 4)
Value       = AgencyRegistryCardNumber
Unformatted = AgencyRegistryCardNumber

Row, Col    = (0, 5)
Value       = Description
Unformatted = Description

Row, Col    = (0, 6)
Value       = CredentialStatusDescription
Unformatted = CredentialStatusDescription

Row, Col    = (0, 7)
Value       = CredentialStatusDate
Unformatted = CredentialStatusDate

Row, Col    = (0, 8)
Value       = CredentialIssuedDate
Unformatted = CredentialIssuedDate

My objective is to build an array of CredentialIssuedDate, AgencyRegistryCardNumber, and AgencyName. Once I grasp the concept of doing that, I can go to town with this great module.

Here's a quick example of something that should work for you. For each worksheet it builds an array @rows of arrays, each holding the three field values you want, and displays the result using Data::Dumper. I haven't been able to test it, but it looks right and does compile.

It starts by building a hash %headers that maps each column header string to its column number, based on the first row in each worksheet.

Then every row from the second onwards is processed: the cells in the columns named in the @wanted array are extracted and their values are put in the array @row, which is pushed onto @rows once it is complete.

#!/usr/bin/perl

use strict;
use warnings;

use Spreadsheet::ParseExcel;
use Data::Dumper;

my @wanted = qw/
    CredentialIssuedDate
    AgencyRegistryCardNumber
    AgencyName
/;

my $parser   = Spreadsheet::ParseExcel->new;
my $workbook = $parser->parse('file.xls');

if ( not defined $workbook ) {
    die $parser->error, ".\n";
}

for my $worksheet ( $workbook->worksheets ) {

    my ( $row_min, $row_max ) = $worksheet->row_range;
    my ( $col_min, $col_max ) = $worksheet->col_range;

    my %headers;

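    # Map each header string in the first row to its column number.
    # The range operator (..) visits every column, not just the first and last.
    for my $col ( $col_min .. $col_max ) {
        my $cell = $worksheet->get_cell($row_min, $col);
        next unless $cell;
        $headers{ $cell->value } = $col;
    }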

    my @rows;

    for my $row ( $row_min + 1 .. $row_max ) {

        my @row;

        for my $name ( @wanted ) {
            my $col  = $headers{$name};
            # A wanted header that is missing from the sheet yields an empty value
            my $cell = defined $col ? $worksheet->get_cell($row, $col) : undef;
            push @row, $cell ? $cell->value : "";
        }

        push @rows, \@row;
    }

    print Dumper \@rows;
}
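
If a hash per row suits the database step better than positional arrays, the same header lookup can fill an array of hashes instead. The following is only a sketch: rows_to_records is a made-up helper name, and @table is invented sample data standing in for the cell values read from the worksheet, so it runs without an XLS file or the module.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper (not part of Spreadsheet::ParseExcel): turn a table
# (array of row arrays, header row first) into an array of hashes that
# hold only the wanted columns.
sub rows_to_records {
    my ( $table, $wanted ) = @_;

    my ( $header_row, @data_rows ) = @$table;

    # Map each header string to its column index
    my %headers;
    for my $col ( 0 .. $#$header_row ) {
        my $name = $header_row->[$col];
        $headers{$name} = $col if defined $name;
    }

    my @records;
    for my $row (@data_rows) {
        my %record;
        for my $name (@$wanted) {
            my $col = $headers{$name};
            # A missing column or an empty cell becomes an empty string
            $record{$name} =
                ( defined $col && defined $row->[$col] ) ? $row->[$col] : "";
        }
        push @records, \%record;
    }

    return \@records;
}

# Invented sample data standing in for values read from the worksheet
my @table = (
    [ 'NewRecordFlag', 'AgencyName', 'AgencyRegistryCardNumber', 'CredentialIssuedDate' ],
    [ 'Y', 'Agency One', '1001', '2012-01-15' ],
    [ 'N', 'Agency Two', '1002' ],    # last cell is empty
);

my @wanted  = qw/ CredentialIssuedDate AgencyRegistryCardNumber AgencyName /;
my $records = rows_to_records( \@table, \@wanted );

print "$_->{AgencyName}: $_->{AgencyRegistryCardNumber}\n" for @$records;
```

With the real module, each row of @table would be filled from $cell->value inside the row and column loops. Keying each record by header name means the database code does not depend on the order of @wanted.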

I was able to resolve this by using the Spreadsheet::BasicReadNamedCol module.

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Spreadsheet::BasicReadNamedCol;

my $xlsFileName = 'shit.xls';
my @columnHeadings = (
'AgencyName',
'eMail',
'PhysicalAddress1',
'PhysicalAddress2'
);

my $ss = Spreadsheet::BasicReadNamedCol->new($xlsFileName)
    or die "Could not open '$xlsFileName': $!";
$ss->setColumns(@columnHeadings);

# Print each row of the spreadsheet in the order defined in
# the columnHeadings array
my $row = 0;
while (my $data = $ss->getNextRow())
{
   $row++;
   print join('|', $row, @$data), "\n";
}
