
Identify words from the string with regex

I have the following text:

your salary $4500 is deposited in account ABC09-234-1234
your salary $4500 is deposited in account abc09-234-1234

I tried the regex (\\d+)|([A-Z0-9-]+), but it does not work with lowercase letters.

I want to extract $4500 and the account number. Please help me with it.

Two options:

  • use [A-Za-z0-9-], so that the character class also covers lowercase letters.
  • use the i regex modifier, to make the match case-insensitive.

Something like this:

/(\$\d+)|([A-Z0-9-]+)$/i
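To see how the two options differ from the original class, here is a minimal sketch that looks only at the account-number half of the pattern. The helper name trailing_account and the three variable names are made up for illustration; they are not from the question.

```perl
use strict;
use warnings;

# Return the text captured at the end of the line, or undef if nothing matches.
# $re is expected to capture the account number in group 1.
sub trailing_account {
    my ($line, $re) = @_;
    return $line =~ $re ? $1 : undef;
}

my $upper_only  = qr/([A-Z0-9-]+)$/;     # class from the question
my $both_cases  = qr/([A-Za-z0-9-]+)$/;  # option 1: widen the class
my $ignore_case = qr/([A-Z0-9-]+)$/i;    # option 2: the i modifier

my $line = 'your salary $4500 is deposited in account abc09-234-1234';

for my $re ($upper_only, $both_cases, $ignore_case) {
    my $found = trailing_account($line, $re);
    print defined($found) ? $found : '(no match)', "\n";
}
```

On the lowercase sample line this prints 09-234-1234 for the original class (the letters are dropped rather than the match failing outright) and abc09-234-1234 for both fixes.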

Edit: In light of your 'end of line' not being a firm anchor:

(\$\d+)|\b([A-Z0-9]*-[A-Z0-9]*)\b

Instead of relying on the end of the line, this captures a sequence of letters and digits that must include a - symbol.
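One thing worth checking with this pattern: as written it allows only a single - per match, so an account number with two hyphens may come back in two pieces. The sketch below compares it with a variant that repeats the "-chunk" part. The variant and the helper name hyphenated_tokens are my own assumptions, not part of the answer, and only the account half of the regex is tested.

```perl
use strict;
use warnings;

# Collect every token captured in group 1 of $re across the whole line.
sub hyphenated_tokens {
    my ($line, $re) = @_;
    my @found;
    while ($line =~ /$re/g) {
        push @found, $1;
    }
    return @found;
}

# Account half of the pattern above, with the i modifier applied.
my $single_hyphen = qr/\b([A-Z0-9]*-[A-Z0-9]*)\b/i;
# Assumed variant: a chunk followed by one or more "-chunk" groups.
my $multi_hyphen  = qr/\b([A-Z0-9]+(?:-[A-Z0-9]+)+)\b/i;

my $line = 'your salary $4500 is deposited in account abc09-234-1234';

print join(' | ', hyphenated_tokens($line, $single_hyphen)), "\n";
print join(' | ', hyphenated_tokens($line, $multi_hyphen)), "\n";
```

Against the sample line, the first pattern yields abc09-234 and -1234 as separate matches, while the variant yields abc09-234-1234 in one piece.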

But you can perhaps simplify, if you assume that the only things you are interested in are the substrings ending with a digit (as in both of your examples):

/(\S*\d)/

This will match on both of your lines:

Demo
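A rough sketch of how the \S*\d idea could be applied to the two sample lines, using a global match in list context to collect every hit. The sub name digit_terminated is hypothetical. Bear in mind that this will also pick up any other whitespace-free token ending in a digit, so it only suits input shaped like the examples.

```perl
use strict;
use warnings;

# Return every whitespace-free run that ends in a digit.
# Call in list context; /g then returns all captures at once.
sub digit_terminated {
    my ($line) = @_;
    return $line =~ /(\S*\d)/g;
}

my @lines = (
    'your salary $4500 is deposited in account ABC09-234-1234',
    'your salary $4500 is deposited in account abc09-234-1234',
);

print join(' ', digit_terminated($_)), "\n" for @lines;
```

This prints $4500 followed by the account number for each line, in the original letter case.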

You can use the following regex:

(?<salary>\$\d+)|\b(?<account>[a-zA-Z0-9]+(?:-[0-9]+)+)

See demo

This regex will match $4500-like substrings (anywhere in the string) and ABC09-234-1234-like strings.
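If you are in Perl (5.10 or later, which is what named captures need), the group names can be read back through the %+ hash. Below is a minimal sketch under that assumption; the sub name extract_fields and the hash layout are illustrative only.

```perl
use strict;
use warnings;

my $re = qr/(?<salary>\$\d+)|\b(?<account>[a-zA-Z0-9]+(?:-[0-9]+)+)/;

# Walk all matches on a line and file each one under its group name.
# Only one of the two named groups is set per match, because they sit
# on opposite sides of the alternation.
sub extract_fields {
    my ($line) = @_;
    my %fields;
    while ($line =~ /$re/g) {
        $fields{salary}  = $+{salary}  if defined $+{salary};
        $fields{account} = $+{account} if defined $+{account};
    }
    return \%fields;
}

my $f = extract_fields('your salary $4500 is deposited in account abc09-234-1234');
print "salary=$f->{salary} account=$f->{account}\n";
```

For the sample line this prints salary=$4500 account=abc09-234-1234.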

Assumptions :

  • Money values will always start with $ and will not contain spaces or any other non-numeric character
  • Account number is always formatted in 3 'chunks' separated by -

Solution :

(\$\d+)|([A-Za-z0-9]+-[A-Za-z0-9]+-[A-Za-z0-9]+)

Here is a working example
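In case the linked example is unavailable, here is a small sketch of the same pattern in Perl, pulling both values out of one line. The sub name salary_and_account and the second test string are invented for illustration. The second call shows the consequence of the three-chunk assumption: a two-chunk identifier is not picked up.

```perl
use strict;
use warnings;

my $re = qr/(\$\d+)|([A-Za-z0-9]+-[A-Za-z0-9]+-[A-Za-z0-9]+)/;

# Return (salary, account) for one line; either may be undef if absent.
sub salary_and_account {
    my ($line) = @_;
    my ($salary, $account);
    while ($line =~ /$re/g) {
        $salary  = $1 if defined $1;   # first alternative matched
        $account = $2 if defined $2;   # second alternative matched
    }
    return ($salary, $account);
}

for my $line ('your salary $4500 is deposited in account abc09-234-1234',
              'paid $10 into XY-12') {
    my ($salary, $account) = salary_and_account($line);
    printf "%s / %s\n", $salary // '(none)', $account // '(none)';
}
```

The first line gives $4500 / abc09-234-1234; the second gives $10 / (none), since XY-12 has only two chunks.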

It's not very clear what you want, but this solution may help you:

use strict;
use warnings;

while ( <DATA> ) {
  # split the line on whitespace and keep only the fields that contain a digit
  my @words = grep /\d/, split;
  print "@words\n";
}


__DATA__
your salary $4500 is deposited in account ABC09-234-1234
your salary $4500 is deposited in account abc09-234-1234

Output:

$4500 ABC09-234-1234
$4500 abc09-234-1234
