Split line with Perl
title: Football team: Real Madrid stadium: Santiago Bernabeu players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))
How can I split this with Perl into:
title: Football
team: Real Madrid
stadium: Santiago Bernabeu
players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl
personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))
Use a lookahead assertion (the \b keeps the split from also firing inside each key):
say for split /(?=\b\w+:)/, $real_madrid_string;
This yields:
title: Football
team: Real Madrid
stadium: Santiago Bernabeu
players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl
personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))
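A minimal, self-contained sketch of the lookahead split, assuming a shortened ASCII-only copy of the input so that no encoding setup is needed. It also shows why the word boundary matters: split retries the pattern at every position, and a bare `(?=\w+:)` succeeds inside the key as well as in front of it.

```perl
use strict;
use warnings;

# Shortened sample input (an assumption for illustration, not the full line).
my $string = 'title: Football team: Real Madrid stadium: Santiago Bernabeu';

# Split before every "word:" key. The lookahead consumes nothing, so each key
# stays attached to the value that follows it.
my @fields = split /(?=\b\w+:)/, $string;

# Every field except the last keeps the space that preceded the next key.
s/\s+$// for @fields;

print "$_\n" for @fields;

# Without \b the lookahead also matches before "itle:", "tle:", and so on,
# so the key itself is cut into single letters.
my @broken = split /(?=\w+:)/, 'title: Football';
print scalar(@broken), " pieces without \\b\n";
```

The trailing-whitespace trim is optional; it only matters if the fields are compared or stored rather than printed.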
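This should do it. line.txt contains "title: Football team: Real Madrid stadium: Santiago Bernabeu players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))"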
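#!/usr/bin/perl
use strict;
use warnings;

my $fn = "./line.txt";
open(my $in, '<', $fn) or die "Cannot open $fn: $!";
my @lines = <$in>;
close($in);

my %hash;
my @order;      # keys in the order they first appear
my $hashKey;
foreach my $line (@lines) {
    chomp $line;
    foreach my $word (split ' ', $line) {
        if ($word =~ m/^(.+):$/) {
            # A word ending in a colon starts a new field; store the key without the colon
            $hashKey = $1;
            unless (exists $hash{$hashKey}) {
                push @order, $hashKey;
                $hash{$hashKey} = '';
            }
        } elsif (defined $hashKey) {
            # Any other word is appended to the current field's value
            $hash{$hashKey} .= length($hash{$hashKey}) ? " $word" : $word;
        }
    }
}
# Print in input order; iterating over keys %hash would give an arbitrary order
foreach my $key (@order) {
    print "$key: $hash{$key}\n";
}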
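Contrary to what many have said in their answers, you do not need lookahead (beyond what the regex engine does on its own); you only need to capture part of the delimiter, like so: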
my @hash_fields = grep { length; } split /\s*(\w+):\s*/;
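As a rough illustration of what the capturing split returns, here is a self-contained sketch, assuming a shortened ASCII-only sample line held in a hypothetical `$line` variable (the one-liner above splits `$_` instead).

```perl
use strict;
use warnings;

# Shortened sample input (an assumption for illustration).
my $line = 'title: Football team: Real Madrid stadium: Santiago Bernabeu';

# Because the key sits in a capture group, split returns it between the values:
# ('', 'title', 'Football', 'team', 'Real Madrid', 'stadium', 'Santiago Bernabeu').
# grep drops the empty leading field produced by the match at the start of the string.
my @hash_fields = grep { length } split /\s*(\w+):\s*/, $line;

# The list now alternates key, value, so it can initialise a hash directly.
my %hash = @hash_fields;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

One caveat with this design: `grep { length }` would also drop a genuinely empty value, which shifts every later key/value pair, so the approach assumes each key is followed by some text.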
My full solution follows:
my %handlers
    = ( players   => sub { return [ grep { length; } split /\s*,\s*/, shift ]; }
      , personnel => sub {
            my $value = shift;
            my %personnel;
            # Using recursive regex for nested parens
            while ( $value =~ m/([^(]*)([(](?:[^()]+|(?2))*[)])/g ) {
                my ( $name, $role ) = ( $1, $2 );
                $role =~ s/^\s*[(]\s*//;
                $role =~ s/\s*[)]\s*$//;
                $name =~ s/^\s+//;
                $name =~ s/\s+$//;
                $personnel{ $role } = $name;
            }
            return \%personnel;
        }
      );
my %hash = grep { length; } split /(?:^|\s+)(\w+):\s+/, <DATA>;
foreach my $field ( keys %handlers ) {
    $hash{ $field } = $handlers{ $field }->( $hash{ $field } );
}
The dump looks like this:
%hash: {
    personnel => {
        'assistant coach (es)' => 'Aitor Karanka',
        'head coach' => 'José Mourinho'
    },
    players => [
        'Zinédine Zidane',
        'Ronaldo',
        'Luís Figo',
        'Roberto Carlos',
        'Raúl'
    ],
    stadium => 'Santiago Bernabeu',
    team => 'Real Madrid',
    title => 'Football'
}
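To see the recursive part of the personnel pattern in isolation, the following sketch runs it on the personnel value alone. It assumes an ASCII spelling of the names to sidestep encoding setup, collects the results in a hypothetical `@pairs` array instead of a hash, and needs Perl 5.10 or later for `(?2)`.

```perl
use strict;
use warnings;

# ASCII-only sample value (an assumption for illustration).
my $value = 'Jose Mourinho (head coach) Aitor Karanka (assistant coach (es))';

# Group 1 is the name: everything up to the next "(".
# Group 2 is a balanced parenthesised group; (?2) re-enters group 2 itself,
# so the inner "(es)" is consumed as part of the outer role.
my @pairs;
while ( $value =~ m/([^(]*)([(](?:[^()]+|(?2))*[)])/g ) {
    my ( $name, $role ) = ( $1, $2 );
    $role =~ s/^[(]\s*|\s*[)]$//g;   # drop only the outermost parentheses
    $name =~ s/^\s+|\s+$//g;         # trim surrounding whitespace
    push @pairs, [ $role, $name ];
}

print "$_->[0] => $_->[1]\n" for @pairs;
```

A non-recursive pattern such as `[(][^)]*[)]` would stop at the first closing parenthesis and cut the second role short, which is the case the recursion is there to handle.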
The best way is to use split with a zero-width lookahead:
$string = "title: Football team: Real Madrid stadium: Santiago Bernabeu players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))";
@split_string = split /(?=\b\w+:)/, $string;
$string = "title: Football team: Real Madrid stadium: Santiago Bernabeu players: Zinédine Zidane, Ronaldo, Luís Figo, Roberto Carlos, Raúl personnel: José Mourinho (head coach) Aitor Karanka (assistant coach (es))";
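@words = split(' ', $string);
@lines = ();
# The first word is the first key, so it starts the first line
@line = (shift(@words));
foreach $word (@words)
{
    if ($word =~ /:/)
    {
        # A word containing a colon is a key: finish the current line and start a new one with it
        push(@lines, join(' ', @line));
        @line = ($word);
    }
    else
    {
        push(@line, $word);
    }
}
# Flush the last line, which has no following key to trigger the push
push(@lines, join(' ', @line));
print join("\n", @lines), "\n";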