
How to get N consecutive digits from a string?

I am trying to extract every run of 4 consecutive digits from a string.

When I run re.sub('[^\\d]+', ',', "abc 23 [1981] ghj [5656]") it returns ,23,1981,5656, . But when I run re.sub('[\\d]{4}+', ',', "abc 23 [2021]") it raises the error "multiple repeat at position 7".

Since I am using {4}, shouldn't it match exactly 4 occurrences of [\d] and return ,1981,5656, ?

What you want is a little tricky to do with a regex alone.

Instead, you can pass a lambda as the replacement so that every part of the string that is not a four-digit number is replaced with a comma, while the four-digit numbers are kept as they are. Try this Python code:

import re

s = "abc 23 [1981] ghj [5656]"
print(re.sub(r'\b(\d{4})\b|((?!\b\d{4}\b).)+', lambda x: x.group() if x.group(1) else ',', s))

This prints the following, just as you wanted:

,1981,5656,

Here, \b(\d{4})\b captures a standalone run of four digits in group 1, while ((?!\b\d{4}\b).)+ matches one or more characters up to the point where such a run begins. The replacement is conditional: if the first alternative matched, group 1 is non-empty, so the matched text is kept as is; if group 1 is empty, the second alternative matched, so that text is replaced with a comma.
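To make the conditional replacement easier to follow, here is a minimal sketch that unpacks the lambda into a named function and prints what each match turns into. The name keep_four_digits is only illustrative and not part of the original answer; the pattern and input are the same as above.

```python
import re

s = "abc 23 [1981] ghj [5656]"
pattern = re.compile(r'\b(\d{4})\b|((?!\b\d{4}\b).)+')

def keep_four_digits(match):
    # Hypothetical helper: same logic as the lambda in the answer.
    if match.group(1):          # first alternative matched: a four-digit run
        return match.group(1)   # keep it unchanged
    return ','                  # anything else collapses to one comma

# Show each piece of the string and what it is replaced with.
for m in pattern.finditer(s):
    print(repr(m.group()), '->', repr(keep_four_digits(m)))

result = pattern.sub(keep_four_digits, s)
print(result)  # ,1981,5656,
```

The loop should show five matches: three stretches of non-matching text, each turned into a comma, and the two four-digit numbers passed through untouched.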

Use re.findall. Note that the pattern below only picks up four-digit numbers enclosed in square brackets.

Example:

import re

s = "abc 23 [1981] ghj [5656]"
print(re.findall(r"\[(\d{4})\]", s))

Output:

['1981', '5656']
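If the digits are not always wrapped in brackets, the same re.findall idea can be generalised to any N. The sketch below assumes "N consecutive digits" means a run of exactly N digits that is not part of a longer number; find_digit_runs and the extended sample string are made up for illustration. Note that \d{4} on its own already means "exactly four digits"; the trailing + in {4}+ stacks a second quantifier on the first, which is what the "multiple repeat" error in the question complains about.

```python
import re

def find_digit_runs(text, n):
    """Return every run of exactly n consecutive digits in text."""
    # (?<!\d) and (?!\d) reject runs that belong to a longer number,
    # without requiring a word boundary around the digits.
    pattern = r"(?<!\d)\d{%d}(?!\d)" % n
    return re.findall(pattern, text)

s = "abc 23 [1981] ghj [5656] x12345 y2021z"
runs = find_digit_runs(s, 4)
print(runs)                         # ['1981', '5656', '2021']
print("," + ",".join(runs) + ",")   # ,1981,5656,2021,
```

Using digit lookarounds instead of \b is a design choice: \b would skip 2021 in y2021z, because there is no word boundary between a letter and a digit.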


 