How to remove numbers from this string in Perl?

I have a situation where I need to remove (or substitute) N numbers when they are between literal characters. For example:

00000ASEDF32434VSC

I need 32434 to be removed so that the output is:

00000ASEDFVSC

And if I want to substitute 32434 with XXXXX ?

How to do it using regexes?

EDIT

I chose a very stupid example. Sorry. Consider this:

00000SQWDDWE12CSDCASEDF32434VSC

I need to substitute (or remove) only the pattern 32434.

In the general case: if I have an alphanumeric string, how do I remove a pattern of N (say 5) numbers when they are between literal characters?

Thanks.

"remove a pattern of N (say 5) numbers when they are between literal characters?"

What do you mean by "literal characters"? Do you mean "letters"? The following removes a sequence of 5 digits (0-9) that are both preceded and followed by a letter.

$str =~ s/(?<=\pL)[0-9]{5}(?=\pL)//;

Feel free to replace 5 with a variable.

my $n = 5;
$str =~ s/(?<=\pL)[0-9]{$n}(?=\pL)//;

As of Perl 5.14, you can use \d to mean [0-9] by adding the /a modifier. (Without /a, \d matches over 400 different characters.)

$str =~ s/(?<=\pL)\d{5}(?=\pL)//a;
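As a rough sketch, the same pattern can be wrapped in a small helper that either deletes or masks the digits. The name mask_digits_between_letters and its arguments are made up for illustration, and /g is added here (an assumption beyond the snippets above) so that every qualifying run is handled, not just the first.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): delete or mask runs of
# exactly $n ASCII digits that sit directly between two letters.
sub mask_digits_between_letters {
    my ($str, $n, $mask_char) = @_;
    # Without a mask character the run is deleted; otherwise each digit is masked.
    my $replacement = defined $mask_char ? $mask_char x $n : '';
    $str =~ s/(?<=\pL)[0-9]{$n}(?=\pL)/$replacement/g;
    return $str;
}

my $input = '00000SQWDDWE12CSDCASEDF32434VSC';
print mask_digits_between_letters($input, 5), "\n";       # 00000SQWDDWE12CSDCASEDFVSC
print mask_digits_between_letters($input, 5, 'X'), "\n";  # 00000SQWDDWE12CSDCASEDFXXXXXVSC
```

The leading 00000 is left alone because no letter precedes it, and 12 is left alone because it is not five digits long.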

We can use a lookahead (?=...) and a lookbehind (?<=...) to assert that the numbers are preceded and followed by non-numbers. This would remove such enclosed numbers:

$str =~ s{ (?<=\D) (\d+) (?=\D) }{}xg;

We can give a different substitution, or even code that will be executed. Here, each run of digits is replaced with the same number of X's:

$str =~ s{(?<=\D) (\d+) (?=\D)}{ "X" x length $1 }xge;

/e evaluates the replacement as Perl code, and x is the underused repetition operator.

Here is a subroutine that returns the string with all such number sequences replaced by X's, with an optional minimum and maximum length:

use Carp;
sub remove_numbers {
  my ($string, $min, $max) = @_;
  $min //= 1;
  $max //= "";
  croak qq(argument \$min is not valid) if $min =~ /[^0-9]/;
  croak qq(argument \$max is not valid) if $max =~ /[^0-9]/;
  $string =~ s/(?<=\D) (\d{$min,$max}) (?=\D)/"X" x length $1/xge;
  return $string;
}

The call

$str = remove_numbers($str, 5, 5);

would be equivalent to $str =~ s/(?<=\D)(\d{5})(?=\D)/XXXXX/g . The call

$str = remove_numbers($str);

would be equivalent to my second code example.

Use this code to remove:

$str =~ s/([a-z]+)\d{5}([a-z]+)/$1$2/i;

and this to replace:

$str =~ s/([a-z]+)\d{5}([a-z]+)/${1}XXXXX$2/i;

where 5 is the number of digits you want to replace. (In the replacement part, Perl expects $1 and $2 rather than \1 and \2.)

You can pull out the relevant part to replace, then repack it. I'm assuming there is just a single run of digits. That way it doesn't matter how many digits there are; you always replace them with the same number of X's...

$str = "00000ASEDF32434VSC";

my($prefix, $digits, $suffix) = $str =~ /^(.*[a-z])(\d+)([a-z].*)$/i;

# Replace with X (/r, Perl 5.14+, returns the modified copy instead of a count)
$str = $prefix . ($digits =~ s/./X/gr) . $suffix;

# Or remove
$str = $prefix . $suffix;

You probably want to make sure that the regex succeeded though!
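A minimal sketch of that check, assuming the same single-run pattern as above. The helper name mask_single_run is hypothetical, and it uses the x operator rather than a nested substitution so it does not depend on the Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the masked string, or undef when the
# letter-digits-letter pattern does not occur in the input.
sub mask_single_run {
    my ($str) = @_;
    # A failed match returns an empty list, so the condition is false
    # and the captures are never used uninitialised.
    if (my ($prefix, $digits, $suffix) = $str =~ /^(.*[a-z])(\d+)([a-z].*)$/i) {
        return $prefix . ('X' x length($digits)) . $suffix;
    }
    return;
}

my $masked = mask_single_run('00000ASEDF32434VSC');
print defined $masked ? "$masked\n" : "no match\n";   # 00000ASEDFXXXXXVSC
```

Because .* is greedy, a string with several letter-bounded runs has only the last one masked.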

In response to your edit: You actually want to remove a specific number...

$str =~ s/^(.*[a-z])32434([a-z].*)$/$1XXXXX$2/i;

Or if you want 5 digits (this has already been answered):

$str =~ s/^(.*[a-z])\d{5}([a-z].*)$/$1XXXXX$2/i;

$str =~ s/[1-9]//g;

This would remove all the digits 1-9 (but not 0), anywhere in the string. Would this work?

$str =~ s/(\D)\d+(\D)/$1$2/g;

This should remove digits only when they sit between non-digit characters, but I am not sure it works in every case.
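One case worth checking is two digit runs separated by a single character. The sketch below (the variable names and test string are illustrative) compares the capturing version with a lookaround version. Since the capturing pattern consumes the non-digit on each side, that character is no longer available as the left neighbour of the next run.

```perl
use strict;
use warnings;

my $input = 'a1b2c';

# Capturing version: each match consumes the non-digits on both sides.
(my $consuming = $input) =~ s/(\D)\d+(\D)/$1$2/g;

# Lookaround version: the neighbours are only asserted, not consumed.
(my $lookaround = $input) =~ s/(?<=\D)\d+(?=\D)//g;

print "$consuming\n";    # ab2c
print "$lookaround\n";   # abc
```

For inputs where runs are always separated by two or more non-digits, both versions give the same result.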

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.