Read chunks of data in Perl

Question

What is a good way in Perl to split a line into pieces of varying length when there is no delimiter I can use? My data is organized by column position, so the first variable is in positions 1-4, the second is in positions 5-15, and so on. There are many variables, each with a different length.

Put another way, is there some way to use the split function based on the position in the string, not a matched expression?

Thanks.

Answer 1

Yes there is. The unpack function is well-suited to dealing with fixed-width records.

Example

my $record = "1234ABCDEFGHIJK";
my @fields = unpack 'A4A11', $record;  # 1st field is 4 chars long, 2nd is 11

print "@fields";                       # Prints '1234 ABCDEFGHIJK'

The first argument is the template, which tells unpack where the fields begin and end. The second argument tells it which string to unpack.

unpack can also be told to skip character positions in a string with x, which skips one byte forward (x2 skips two). The template 'A4x2A9' could be used to ignore the "AB" in the example above.
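As a minimal sketch of the skip template, using the same sample record as above (the variable names $id and $rest are just for illustration), something like this should work. Whitespace inside the template is ignored, so the fields can be spaced out for readability:

```perl
use strict;
use warnings;

my $record = "1234ABCDEFGHIJK";

# A4 takes 4 characters, x2 skips the next 2 ("AB"), A9 takes the following 9.
# Note that A also strips trailing spaces from each extracted field.
my ($id, $rest) = unpack 'A4 x2 A9', $record;

print "$id|$rest\n";    # 1234|CDEFGHIJK
```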

See perldoc -f pack and perldoc perlpacktut for in-depth details and examples.

Answer 2

Instead of using split, try the old-school substr method:

my $first = substr($input, 0, 4);
my $second = substr($input, 4, 11);    # offset is zero-based; the third argument is the length
# etc...

(I like the unpack method too, but substr is easier to write without consulting the documentation, if you're only parsing out a few fields.)
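If there are more than a handful of fields, one way to keep the substr approach manageable is to put the layout in a table. This is only a sketch: the field names and the @layout structure are made up for illustration. A field in 1-based positions START-END has offset START - 1 and length END - START + 1, so positions 5-15 become offset 4, length 11.

```perl
use strict;
use warnings;

# Hypothetical layout table: [ field name, zero-based offset, length ]
my @layout = (
    [ id   => 0, 4  ],    # positions 1-4
    [ name => 4, 11 ],    # positions 5-15
);

my $input = "1234ABCDEFGHIJK";

my %field;
for my $spec (@layout) {
    my ($name, $offset, $length) = @$spec;
    $field{$name} = substr($input, $offset, $length);
}

print "$field{id} $field{name}\n";    # 1234 ABCDEFGHIJK
```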

Answer 3

You could use the substr() function to extract data by offset:

$first = substr($line, 0, 4);
$second = substr($line, 4, 11);

Another option is to use a regular expression:

($first, $second) = ($line =~ /(.{4})(.{11})/);
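With the regular expression it is worth checking that the match succeeded, since a short line leaves the variables undefined. A hedged sketch, with the pattern anchored at the start of the string (the error message is just an example):

```perl
use strict;
use warnings;

my $line = "1234ABCDEFGHIJK";

# \A anchors the match at the start of the string. A failed match returns
# an empty list, so the list assignment is false and the die fires.
# By default . does not match a newline, so an embedded newline also fails.
my ($first, $second) = $line =~ /\A(.{4})(.{11})/
    or die "record does not contain 15 usable characters\n";

print "$first $second\n";    # 1234 ABCDEFGHIJK
```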
