
In Perl, how can I use the regex substitution operator to replace non-ASCII characters in a substring?

How can I use this command:

perl -pi -e 's/[^[:ascii:]]/#/g' file

so that it changes only the characters between offset A and offset B of each line?

Assuming I understood your question correctly: if the offsets A and B are 5 and 10, it would look like this:

  perl -pi -e 's/(?<=.{5})(?<!.{10})[^[:ascii:]]/#/g' file

Explanation:

   [^[:ascii:]]  <- the character being looked for
   (?<=.{5})     <- only if at least 5 chars precede it (0-based offset 5 or later)
   (?<!.{10})    <- and fewer than 10 chars precede it (before 0-based offset 10)

The constructs:

   (?<= ...) and (?<! ...)

are called positive and negative lookbehinds, which are zero-width assertions: they test the text before the current position without consuming it. (You can google them, or see the section Look-Around Assertions in perlre.)


Addendum 1 You mentioned substr() in your title, which I overlooked at first. This would work too, of course:

  perl -pi -e 'substr($_,5,5)=~s/[^[:ascii:]]/#/g' file

Note that the third argument is a length, not an end offset, so offsets 5 to 10 mean a length of 5. The description of substr EXPR,OFFSET,LENGTH can be found in perldoc (perldoc -f substr). This example nicely illustrates the use of substr() as an lvalue.


Addendum 2 While I was updating this post, Grrrr posted the same solution as an answer, and his came first by a minute! (So he deserves the booty.)

Regards

rbo

As an alternative to rubber boots' answer, you can operate on a substring instead of the whole string to begin with:

perl -pi -e 'substr($_, 5, 5) =~ s/[^[:ascii:]]/#/g' file

To illustrate:

perl -e 'print "\xff" x 16' | \
perl -p -e 'substr($_, 5, 5) =~ s/[^[:ascii:]]/#/g' | \
hd

will print

ff ff ff ff ff 23 23 23  23 23 ff ff ff ff ff ff

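In this code, the first offset is 0-based, and you have to use the length instead of the second offset. For 1-based offsets A and B it will be substr($_, A-1, B-A) if the character at B is excluded, or substr($_, A-1, B-A+1) if it is included.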

