
Can perl replace multiple keywords with their own substitute word in one go?

Consider a textfile with the contents:

apple apple pear plum apple cherry pear apple  
cherry plum plum pear apple cherry pear pear apple plum

And consider the perl one-liner:

perl -pe "s/apple/green/g and s/pear/yellow/g and s/plum/blue/g and s/cherry/red/g" < fruits.txt

This replaces every fruit with its colour.
Now, could this be done in a single s///g somehow, instead of the above four?

I am also concerned about the order of the fruit words.
If my sample does not include "apple", none of the other replacements will complete. How should I fix that?

Please note: I want to keep the solution as a one-liner.
So defining hashes, reading in files and other solutions requiring many lines of perl code do not take me forward.

This is more a matter of curiosity than a life-or-death question that a project depends on. It has been bothering me for some days now, and I thought a more experienced perl user could either solve it in a heartbeat or put me out of my misery by telling me straight that this cannot be done in perl the way I want.

Replace

perl -pe's/apple/green/g and s/pear/yellow/g and ...' fruits.txt

with

perl -pe's/apple/green/g; s/pear/yellow/g; ...' fruits.txt

However, the following is faster and doesn't have a problem with chained mappings such as a=>b, b=>c:

perl -pe'
   BEGIN {
      %subs=qw(apple green pear yellow plum blue cherry red);
      $re=join "|", map quotemeta, keys %subs;
      $re = qr/($re)/;
   }
   s/$re/$subs{$1}/g;
' fruits.txt

Other potential issues:

  • What if you want to replace apple but not apples?
  • What if the hash has keys bee and beer?

Both problems can be solved using suitable anchoring (e.g. $re = qr/\b($re)\b/). The second can also be solved by sorting the keys by decreasing length (sort { length($b) <=> length($a) } keys %subs).

(You can remove the line breaks I added for readability.)

perl -pe '%a=qw(apple green pear yellow plum blue cherry red);$b=join("|",keys %a);s/($b)/$a{$1}/g' < fruits.txt

perl -E 'my %h = qw(apple green foo bar); say "apple foo" =~ s/(apple|foo)/$h{$1}/rge;'

Depending on the problem, I think I'd just be a bit sloppy and look at every run of non-whitespace. If it's something interesting, I replace it. If not, I put the same text back.

 $ perl5.14.2 -nE 'print s/(\S+)/$h{$1}?$h{$1}:$1/rge}BEGIN{%h=qw(apple green pear yellow plum blue cherry red)'

If the problem is any more complicated than that, my one-liner would look like:

 $ perl fruits2color

Several of the other answers build up a regex by joining strings. In a non-one-liner program, I'd probably do that with something like Regexp::Assemble or Regexp::Trie. Those modules can build efficient alternations.

Who said that hashes can't remember their order :) ?

How can I make my hash remember the order I put elements into it?

Use the Tie::IxHash from CPAN.

    use Tie::IxHash;
    tie my %myhash, 'Tie::IxHash';
    for (my $i=0; $i<20; $i++) {
        $myhash{$i} = 2*$i;
    }
    my @keys = keys %myhash;
    # @keys = (0,1,2,3,...)

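$ perl -MTie::IxHash -pe '
         BEGIN { tie %h, "Tie::IxHash";
                 %h = qw< apple green pear yellow >;
               }
         # Use a named loop variable: a bare "for keys %h" would alias $_
         # to each key, so s/// would act on the key instead of the line.
         for my $k (keys %h) { s/\Q$k\E/$h{$k}/g }
        ' file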
