Extract nth occurrence with Perl Regex
I am trying to find the best way to parse a line that looks like this:
Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah
I just want to extract the stuff between the 6th and 7th vertical bar (|).
I tried something like
if ($line =~ /^(.*\|){6}(\w*)\|/ ) {
print $2;
}
The problem is that the first part seems to match the longest sequence possible because of .*, so perhaps there is something different I should be using. Between the vertical bars there are alphanumeric characters, spaces and punctuation.
Should I be matching the shortest sequence between them?
You can use .*? instead, which modifies the * to prefer fewer repetitions over more.
This could still match in the wrong place if the field you want has non-word characters; to prevent this you can either explicitly say anything-but-| ( ([^|]*\|){6} ) or disable backtracking for that part ( ((?>.*?\|)){6} ).
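A minimal sketch of both patterns applied to the sample line from the question. The helper names field_negated and field_atomic are hypothetical, and the capture uses [^|]* rather than \w* on the assumption that the wanted field may contain spaces or punctuation.

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';

# Skip six "anything-but-| then |" groups, then capture the next field.
sub field_negated {
    my ($str) = @_;
    return $str =~ /^(?:[^|]*\|){6}([^|]*)\|/ ? $1 : undef;
}

# Same idea with an atomic group: once a lazy ".*?\|" chunk has matched,
# the engine may not backtrack into it to make it longer.
sub field_atomic {
    my ($str) = @_;
    return $str =~ /^(?>.*?\|){6}([^|]*)\|/ ? $1 : undef;
}

print field_negated($line), "\n";
print field_atomic($line), "\n";
```

Both calls print "and" for the sample line. Each helper returns undef when the line has fewer than seven bars.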
Or you could just use split:
if ( my $seventh = ( split /\|/, $line, 8 )[6] ) {
print $seventh;
}
(The 8 is optional; it tells split not to bother splitting any further after reaching the 7th |.)
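To illustrate the limit argument, here is a small sketch using the sample line from the question. The helper name nth_field is hypothetical. It tests the result with defined rather than truthiness, on the assumption that a field holding 0 or an empty string should still count as found.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field at zero-based index $n, or undef
# if the line has fewer fields than that.
sub nth_field {
    my ($str, $n) = @_;
    # A limit of $n + 2 stops splitting once field $n is complete; the
    # unsplit remainder of the line lands in the final element.
    return ( split /\|/, $str, $n + 2 )[$n];
}

my $line = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';
my $seventh = nth_field($line, 6);
print "$seventh\n" if defined $seventh;
```

With $n = 6 the limit works out to 8, matching the call in the answer above.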
Use split. Something like my @fields = split /\|/, $str should work. Then you just index the field you're interested in (empty fields in the middle of the line are preserved too). The | must be escaped because it is a regexp operator.
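As a quick illustration of the empty-field behaviour, the sketch below splits the sample line from the question into an array. It only assumes the variable names used in the answer.

```perl
use strict;
use warnings;

my $str = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';

# The pipe is escaped because a bare | means alternation in a regex.
my @fields = split /\|/, $str;

# Index 5 is the empty field between the two adjacent bars;
# index 6 is the field between the 6th and 7th bar.
printf "field 5 is '%s', field 6 is '%s'\n", $fields[5], $fields[6];
```

This prints field 5 is '', field 6 is 'and'. Note that split drops trailing empty fields unless you pass a negative limit, so only empty fields followed by a non-empty one are guaranteed to keep their position this way.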