
In Perl, how do I remove consecutive pairs of characters from a string?

I have a string containing pairs of characters and I would like to replace each run by a single character. How can I do that?


This is a question from the official FAQ. We're importing the perlfaq to Stack Overflow.

(This is the official perlfaq answer, minus any subsequent edits.)

You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in (.). The memory parentheses store the matched character in the back-reference \g1 (written \1 before Perl 5.10), and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1. As written, each match covers exactly two characters; to collapse longer runs in one pass, put a + quantifier after the back-reference.

s/(.)\g1/$1/g; # 5.10 or later
s/(.)\1/$1/g;  # earlier versions
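As a minimal sketch of the difference between pairs and runs: the pattern above consumes exactly two characters per match, so a run of three leaves two behind, while a + after the back-reference collapses a run of any length. The helper names squeeze_pairs and squeeze_runs and the sample strings are made up for illustration and are not part of the FAQ answer.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each adjacent pair into one character.
sub squeeze_pairs {
    my ($s) = @_;
    $s =~ s/(.)\1/$1/g;
    return $s;
}

# Hypothetical helper: collapse each run of repeats into one character.
sub squeeze_runs {
    my ($s) = @_;
    $s =~ s/(.)\1+/$1/g;
    return $s;
}

print squeeze_pairs('Haarlem'), "\n";   # Harlem
print squeeze_pairs('aaab'), "\n";      # aab (only the first pair is merged)
print squeeze_runs('aaab'), "\n";       # ab
```

Note that . does not match a newline unless the /s modifier is added, so consecutive newlines are left alone by both helpers.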

We can also use the transliteration operator, tr///. In this example, the search list side of our tr/// contains nothing, but the c option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, it replaces each character with itself). However, the s option squashes consecutive duplicate characters in the string so that a character does not show up next to itself.

my $str = 'Haarlem';   # in the Netherlands
$str =~ tr///cs;       # Now Harlem, like in New York
$str =~ s/(.)\1+/$1/g; # regex alternative: the + collapses a run of any length
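The tr/// form squeezes every character. If only some characters should be squeezed, one option is to list them in the search list and drop the c option. The following is a small sketch under that assumption. The helper names squeeze_lower and squeeze_all and the sample string are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: squeeze runs of lowercase ASCII letters only.
sub squeeze_lower {
    my ($s) = @_;
    $s =~ tr/a-z//s;    # empty replacement list: each letter maps to itself
    return $s;
}

# Hypothetical helper: squeeze runs of any character, as in the FAQ answer.
sub squeeze_all {
    my ($s) = @_;
    $s =~ tr///cs;
    return $s;
}

print squeeze_lower('bookkeeper  1100'), "\n";   # bokeper  1100
print squeeze_all('bookkeeper  1100'), "\n";     # bokeper 10
```

The first call leaves the double space and the repeated digits intact because they are not in the search list.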

This is the answer from perlfaq4 from the last stable release:

How do I remove consecutive pairs of characters?

(contributed by brian d foy)

You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in (.). The memory parentheses store the matched character in the back-reference \1 and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1.

s/(.)\1/$1/g;

We can also use the transliteration operator, tr///. In this example, the search list side of our tr/// contains nothing, but the c option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, it replaces each character with itself). However, the s option squashes consecutive duplicate characters in the string so that a character does not show up next to itself.

my $str = 'Haarlem';   # in the Netherlands
$str =~ tr///cs;       # Now Harlem, like in New York
