In Perl, how can I use the regex substitution operator to replace non-ASCII characters in a substring?

Question

How can I use this command:

perl -pi -e 's/[^[:ascii:]]/#/g' file

to change only the characters from offset A to offset B of each line?

Answer 1

With the reservation that I may not have understood your question correctly: if the offsets A and B are 5 and 10, it should look like this:

  perl -pi -e 's/(?<=.{5})(?<!.{10})[^[:ascii:]]/#/g' file

Explanation:

   [^[:ascii:]]  <- the character being looked for
   (?<=.{5})     <- only if at least 5 characters precede it (offset 5)
   (?<!.{10})    <- and fewer than 10 characters precede it (offset 10)

The constructs:

   (?<= ...) and (?<! ...)

are called positive and negative lookbehinds, which are zero-width assertions. (You can google them, or see the section Look-Around Assertions in perlre.)

Addendum 1: You mentioned substr() in your title, which I overlooked at first. This would work too, of course:

  perl -pi -e 'substr($_,5,5)=~s/[^[:ascii:]]/#/g' file

The description of substr EXPR,OFFSET,LENGTH can be found in perldoc (perldoc -f substr); note that the third argument is a length, not an end offset. This example nicely illustrates the use of substr() as an lvalue.

Addendum 2: While I was updating this post, Grrrr added the same solution as an answer, and his came first by a minute (so he deserves the booty)!

Regards

rbo

Answer 2

As an alternative to rubber boots' answer, you can operate on a substring instead of the whole string to begin with:

perl -pi -e 'substr($_, 5, 5) =~ s/[^[:ascii:]]/#/g' file

To illustrate:

perl -e 'print "\xff" x 16' | \
perl -p -e 'substr($_, 5, 5) =~ s/[^[:ascii:]]/#/g' | \
hd

will print

ff ff ff ff ff 23 23 23  23 23 ff ff ff ff ff ff

In this code, the first offset is 0-based, and you have to use the length instead of the second offset, so it will be substr($_, A-1, B-A) (or B-A+1 if the character at offset B itself should be included).

2 answers

solution1
7 2012-06-28 09:14:08

solution2
7 ACCEPTED 2012-06-28 09:37:38
