I would like to extract the number from each of these strings.
strings = ["point_right: account ISLAMIC: 860328 9221 asdsad",
"account 723123123",
"account823123213",
"account 823.123.213",
"account 823-123-213",
"account:123213123 ",
"account: 123213123 asdasdsad 017-299906",
"account: 123213123",
"point_right: account ISLAMIC: 860328 9221"
]
Result would be
[860328 9221,723123123, 823123213, 823.123.213, 823-123-213, 123213123, 123213123, 123213123]
I can do further processing later to turn them into numbers. So far my strategy is to capture everything after the pattern and before the next letter. I have tried:
import re

for string in strings:
    print(re.findall("(?<=account)(.*)", string.lower()))
Please give me some pointers on the regex match.
Try this pattern:
(?=[^0-9]*)[0-9][0-9 .-]*[0-9]
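To see what this pattern returns, here is a small sketch run against a few of the strings from the question. The names `with_lookahead`, `without_lookahead`, and `samples` are only illustrative. The second pattern drops the leading lookahead so the two results can be compared.

```python
import re

# Pattern from the answer, and the same pattern without the leading lookahead.
with_lookahead = re.compile(r"(?=[^0-9]*)[0-9][0-9 .-]*[0-9]")
without_lookahead = re.compile(r"[0-9][0-9 .-]*[0-9]")

# A few of the strings from the question.
samples = [
    "point_right: account ISLAMIC: 860328 9221 asdsad",
    "account 823.123.213",
    "account:123213123 ",
    "account: 123213123 asdasdsad 017-299906",
]

# findall returns every non-overlapping match in each string.
results = [with_lookahead.findall(s) for s in samples]
for s, found in zip(samples, results):
    print(repr(s), "->", found)
```

On these samples both patterns give the same matches, and the string that contains a second number after the account number yields two matches. Because the pattern requires a first and a last digit, it cannot match a number that is only one digit long.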
Breakdown:
(?=[^0-9]*)
Lookahead for zero or more non-digit characters, such as the word "account". Because of the *, it can also match nothing, so it never rejects a position.
[0-9]
Find a digit
[0-9 .-]*
Find any number of digits or special characters (your strings contain spaces, dashes, and periods, so I included those)
[0-9]
Find another digit (to prevent spaces at the end)
(?!\W)([\d\s.-]+)(?<!\s)
The negative lookahead and lookbehind seem like overkill here, but I wasn't able to get a clean match otherwise.
(?!\W)
Negative lookahead to exclude any non-word characters [^a-zA-Z0-9_]
([\d\s.-]+)
The capturing group for your numbers
(?<!\s)
Negative lookbehind to exclude whitespace characters [\r\n\t\f\v ]
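As a quick check of the lookahead and lookbehind approach, the sketch below applies the pattern to some of the question's strings. The variable names and the extra `"acc 12-"` input are assumptions added for illustration only.

```python
import re

# The lookahead stops a match from starting on a space, '.', '-' or ':';
# the lookbehind stops it from ending on whitespace.
pattern = re.compile(r"(?!\W)([\d\s.-]+)(?<!\s)")

samples = [
    "point_right: account ISLAMIC: 860328 9221 asdsad",
    "account 823-123-213",
    "account:123213123 ",
    "account: 123213123 asdasdsad 017-299906",
]

# With a single capturing group, findall returns the group contents.
results = [pattern.findall(s) for s in samples]
for s, found in zip(samples, results):
    print(repr(s), "->", found)

# The lookbehind only rejects trailing whitespace, not '.' or '-'.
trailing_dash = pattern.findall("acc 12-")
print(trailing_dash)
```

Note that a match can still end in `.` or `-`, since only whitespace is excluded at the end, and that every number in the string is returned, not just the one after "account".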
If the number must be the first one after the "account" substring, use
re.findall(r"account\D*([\d\s.-]*\d)", s)
Pattern details
account - a literal substring
\D* - 0+ chars other than digits
([\d\s.-]*\d) - Capturing group 1 (the value returned by re.findall): 0 or more digits, whitespaces, . and - chars followed by a digit.
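Putting this pattern to work on the full list from the question might look like the sketch below. The helper `first_account_number` and the final separator-stripping step are assumptions for illustration, not part of the answer. The step that strips separators is one possible way to do the "processing later" mentioned in the question.

```python
import re

ACCOUNT_RE = re.compile(r"account\D*([\d\s.-]*\d)")

strings = [
    "point_right: account ISLAMIC: 860328 9221 asdsad",
    "account 723123123",
    "account823123213",
    "account 823.123.213",
    "account 823-123-213",
    "account:123213123 ",
    "account: 123213123 asdasdsad 017-299906",
    "account: 123213123",
    "point_right: account ISLAMIC: 860328 9221",
]

def first_account_number(text):
    """Return the first number after 'account', or None if there is none."""
    # Lowercase first, as in the question, so "Account" would also match.
    match = ACCOUNT_RE.search(text.lower())
    return match.group(1) if match else None

extracted = [first_account_number(s) for s in strings]
print(extracted)

# Optional post-processing: drop every non-digit character (spaces, '.', '-').
digits_only = [re.sub(r"\D", "", n) for n in extracted if n is not None]
print(digits_only)
```

Because the pattern is anchored on "account", the trailing `017-299906` in the seventh string is ignored, unlike with the unanchored patterns.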