How does regex substitution work in Perl?

Question

I am trying to remove duplicates from the string "a","b","b","a","c". After removal the result is "a","b","c", (with a trailing comma). I have achieved this, but I have a doubt about how the regex substitution works.

use warnings;
use strict;
my $s = q+"a","b","b","a","c"+;

 $s=~s/ ("\w"),? / ($s=~s|($1)||g)?"$1,":"" /xge;
#^                   ^
#|                   Consider this as s2
#Consider this as s1

print "\n$s\n\n";

The s1 variable contains the string "a","b","b","a","c".

Step 1

After substitution:

Guess which of the following s1 contains at this point: "a","b","b","c", or "a","b","b","a","c", or ,"b","b",,"c"?

I ran the regex with an eval group, (?{ ... }), to print $s at each match:

$s=~s/ ("\w"),? (?{print "$s\n"})/ ($s=~s|($1)||g)?"$1,":"" /xge;

The result is

"a","b","b","a","c"
,"b","b",,"c"  #This is from after substitution
,,,,"c"
,,,,"c"
,,,,"c"

Now my doubt is this: s2 is also $s, so why is its change not combined with s1? That is, at the second step I expected the result to be "a","b","b","c" (every "a" is replaced with the empty string, and one "a" is added back to $s).

Edited

The result from the eval group (?{print $s}) is

"a","b","b","a","c"
,"b","b",,"c" 
,,,,"c"
,,,,"c"
,,,,"c"

After the substitution line I printed the $s variable and it gives "a","b","c", so how is this output produced?
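The trace above can also be captured without an eval group. What follows is only a minimal sketch, not a statement about Perl's documented internals: it records $s each time the replacement expression runs, using a hypothetical @snapshots array and a lexical $elem that are not in the original code. The observed values are consistent with the outer substitution continuing to scan the string as it was when the substitution started, and assigning its assembled result to $s only once it finishes, which overwrites whatever the inner substitution left there.

```perl
use strict;
use warnings;

my $s = q+"a","b","b","a","c"+;
my @snapshots;    # hypothetical helper: value of $s seen by each replacement

$s =~ s{ ("\w") ,? }{
    my $elem = $1;            # copy $1 before the inner match can reset it
    push @snapshots, $s;      # what $s holds right now
    # The inner substitution edits $s itself and returns true only if
    # $elem was still present, i.e. only for the first occurrence.
    ($s =~ s/\Q$elem\E//g) ? "$elem," : "";
}gex;

print "$_\n" for @snapshots;
print "final: $s\n";
```

The five snapshots should match the five lines printed by the eval group in the question, and the final value should be the string assembled from the replacement results, including the trailing comma.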

Answer 1

A regex is (in my opinion) the wrong tool to use here. I would

split the string on commas
remove duplicates from the list returned by split
join the list back into a string

Like this:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

my $str = q["a","b","b","a","c"];

my %seen;

$str = join ',',
       grep { ! $seen{$_}++ }
       split /,/, $str;

say $str;
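To make the three steps easier to inspect, here is the same pipeline unrolled into named stages. This is only an illustrative sketch: the variable names @parts, @unique and $result are assumptions introduced here, and it relies on no element containing a comma of its own.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $str = q["a","b","b","a","c"];

# Stage 1: split on commas; each element keeps its double quotes.
my @parts = split /,/, $str;

# Stage 2: keep an element only the first time it is seen.
# $seen{$_}++ returns the old count, which is 0 (false) on first sight.
my %seen;
my @unique = grep { !$seen{$_}++ } @parts;

# Stage 3: join the surviving elements back into one string.
my $result = join ',', @unique;

print scalar(@parts), " elements before, ", scalar(@unique), " after\n";
print "$result\n";
```

Because grep preserves input order, the first occurrence of each element stays in its original position.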

Answer 2

The proper solution to this is split, filter, rejoin, as @Dave Cross has already demonstrated.

...

However, the following regex solution does work, and hopefully demonstrates why Dave's solution is superior.

#!/usr/bin/env perl

use v5.10;
use strict;
use warnings;

my $str = q{"a","b","b","a","c"};

1 while $str =~ s{
    \A
    (?: (?&element) , )*
    ( (?&element) )           # Capture in \1
    (?: , (?&element) )*
    \K
    ,
    \1                        # Remove the duplicate along with preceding comma
    (?= \z | , )

    (?(DEFINE)
        (?<element>
            "
            \w
            "
        )
    )
}{}xg;

say $str;

Outputs:

"a","b","c"
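The loop around the substitution matters because the pattern is anchored with \A, so each pass can remove at most one duplicate. The sketch below uses the same pattern but records the string after every successful pass; the $passes counter and @history array are hypothetical additions for illustration only.

```perl
#!/usr/bin/env perl

use v5.10;
use strict;
use warnings;

my $str     = q{"a","b","b","a","c"};
my $passes  = 0;          # hypothetical counter of successful passes
my @history = ($str);     # hypothetical log of the string after each pass

while ($str =~ s{
    \A
    (?: (?&element) , )*
    ( (?&element) )           # Capture in \1
    (?: , (?&element) )*
    \K
    ,
    \1                        # Remove the duplicate along with preceding comma
    (?= \z | , )

    (?(DEFINE)
        (?<element> " \w " )
    )
}{}xg) {
    $passes++;
    push @history, $str;
}

say for @history;
say "passes: $passes";
```

For this input the loop should run two successful passes, one per duplicate, before the pattern stops matching.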

Answer 1
6 2017-09-13 11:57:13

Answer 2
2 2017-09-13 15:06:24
