Extracting the nth occurrence using a Perl regex

Question

I am trying to find the best way to parse a line that looks like this:

Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah

I just want to extract the text between the 6th and 7th vertical bar (|).
I tried something like:

if ($line =~ /^(.*\|){6}(\w*)\|/ ) {  
    print $2;  
}

The problem is that the first part seems to match the longest possible sequence because of .*, so perhaps I should be using something different. Between the vertical bars there are alphanumeric characters, spaces, and punctuation.

Should I be matching the shortest sequence between them?

Answer 1

You can use .*? instead, which modifies the * to prefer fewer repetitions over more.
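To make the difference concrete, here is a small sketch that runs both patterns against the sample line from the question. The helper names field_greedy and field_lazy are made up for illustration; only the two regexes come from the thread.

```perl
use strict;
use warnings;

# Sample line from the question; the wanted field is "and".
my $line = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';

# Greedy: each .* grabs as much as it can while still letting the
# overall match succeed, so the six repetitions swallow eight bars.
sub field_greedy {
    my ($s) = @_;
    return $s =~ /^(.*\|){6}(\w*)\|/ ? $2 : undef;
}

# Non-greedy: each .*? stops at the first | that lets the match proceed.
sub field_lazy {
    my ($s) = @_;
    return $s =~ /^(.*?\|){6}(\w*)\|/ ? $2 : undef;
}

print "greedy: ", field_greedy($line), "\n";   # greedy: blah
print "lazy:   ", field_lazy($line), "\n";     # lazy:   and
```

The greedy version still matches, just in the wrong place: it captures the field before the last | on the line.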

This could still match in the wrong place if the field you want has non-word characters; to prevent this you can either explicitly say anything-but-| ( ([^|]*\|){6} ) or disable backtracking for that part ( ((?>.*?\|)){6} ).
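A rough illustration of that failure mode, using a hypothetical line whose 7th field contains a space. The sub names are invented for this sketch, and the last variant, which captures with ([^|]*) instead of (\w*), is an assumption about what you would want for fields containing spaces or punctuation; it is not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical line: the 7th field ("x y") contains a non-word character.
my $line = 'a|b|c|d|e|f|x y|z|end';

# .*? can backtrack across a |, so the match slides one field to the right.
sub lazy_prefix {
    my ($s) = @_;
    return $s =~ /^(.*?\|){6}(\w*)\|/ ? $2 : undef;
}

# [^|]* can never cross a |, so the pattern fails instead of sliding.
sub negated_prefix {
    my ($s) = @_;
    return $s =~ /^([^|]*\|){6}(\w*)\|/ ? $2 : undef;
}

# The atomic group (?>...) forbids backtracking into .*? once it has matched.
sub atomic_prefix {
    my ($s) = @_;
    return $s =~ /^((?>.*?\|)){6}(\w*)\|/ ? $2 : undef;
}

# Assumed variant: capture anything-but-| so the field itself may hold spaces.
sub negated_capture {
    my ($s) = @_;
    return $s =~ /^(?:[^|]*\|){6}([^|]*)\|/ ? $1 : undef;
}

for my $test (
    [ 'lazy prefix',     lazy_prefix($line) ],
    [ 'negated prefix',  negated_prefix($line) ],
    [ 'atomic prefix',   atomic_prefix($line) ],
    [ 'negated capture', negated_capture($line) ],
) {
    my ($name, $got) = @$test;
    printf "%-16s %s\n", $name, defined $got ? "'$got'" : 'no match';
}
```

With the first pattern the capture lands on 'z'; the two stricter prefixes report no match rather than a wrong field, and the last variant returns 'x y'.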

Or you could just use split: 或者您可以使用拆分：

if ( my $seventh = ( split /\|/, $line, 8 )[6] ) {
    print $seventh;
}

(The 8 is optional; it tells split not to bother splitting any further after reaching the 7th |.)
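The same idea can be wrapped for an arbitrary position. This is only a sketch: nth_field is a hypothetical helper, $n is 1-based, and the defined test is an addition here so that a field containing "0" or an empty string is not mistaken for a missing one, which a plain truth test would do.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $n-th (1-based) |-separated field,
# or undef if the line has fewer than $n fields.
sub nth_field {
    my ($line, $n) = @_;
    # Limit of $n + 1: split stops after producing the field we need.
    my $field = ( split /\|/, $line, $n + 1 )[ $n - 1 ];
    return $field;
}

my $line = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';

my $seventh = nth_field($line, 7);
print "seventh: $seventh\n" if defined $seventh;   # seventh: and

# A field that is "0" is false in Perl but still defined.
my $zero = nth_field('0|x', 1);
print "first: $zero\n" if defined $zero;           # first: 0
```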

Answer 2

Use split. Something like my @fields = split /\|/, $str should work. Then you just index the field you're interested in (empty fields will be preserved as well). The | must be escaped because it is a regexp operator.
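A short sketch of this approach on the sample line, with one caveat worth knowing: empty fields in the middle are kept, but by default split drops trailing empty fields unless a negative limit is passed. The second input string is made up purely to show that behaviour.

```perl
use strict;
use warnings;

my $str = 'Explicit|00|11|Hello World|12 3 134||and|blah|blah|blah';

# Split on a literal |; the empty 6th field is kept as ''.
my @fields = split /\|/, $str;
print scalar(@fields), " fields\n";     # 10 fields
print "field 6: '$fields[5]'\n";        # field 6: ''
print "field 7: '$fields[6]'\n";        # field 7: 'and'

# Made-up input ending in empty fields.
my @default  = split /\|/, 'a||b||';      # trailing empties dropped
my @keep_all = split /\|/, 'a||b||', -1;  # negative limit keeps them
print scalar(@default), " vs ", scalar(@keep_all), "\n";   # 3 vs 5
```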

2 answers

Answer 1
8 (accepted) 2010-12-19 08:13:05

Answer 2
3 2010-12-19 08:18:15
