
How to get all the substrings in string using Regex in Python

I have a string such as "12345".

Using a regex, how can I get all of its substrings that consist of one to three consecutive characters, producing output such as:

'1', '2', '3', '4', '5', '12', '23', '34', '45', '123', '234', '345'

You can use re.findall with a positive lookahead that captures the next size characters at each position, iterating size from 1 to 3. Because the lookahead itself consumes no characters, the captured matches can overlap:

[match for size in range(1, 4) for match in re.findall('(?=(.{%d}))' % size, s)]
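To see why the lookahead matters, here is a minimal sketch for a single size (2). The variable names consuming and overlapping are only illustrative. re.findall returns non-overlapping matches, so a plain pattern skips every substring that starts inside a previous match:

```python
import re

s = "12345"

# Without a lookahead, each match consumes its characters,
# so the next search starts after the previous match ends.
consuming = re.findall(r".{2}", s)

# With a zero-width lookahead, the match itself is empty, so the engine
# advances one position at a time; the capture group records the two
# characters that follow each position.
overlapping = re.findall(r"(?=(.{2}))", s)

print(consuming)    # ['12', '34']
print(overlapping)  # ['12', '23', '34', '45']
```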

However, a regex is not needed here. It would be more efficient to slice the string directly, using a list comprehension with nested for clauses that iterate over all the sizes and starting indices:

[s[start:start + size] for size in range(1, 4) for start in range(len(s) - size + 1)]

Given s = '12345', both of the above would return:

['1', '2', '3', '4', '5', '12', '23', '34', '45', '123', '234', '345']
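For reference, the two approaches could be wrapped into runnable functions roughly as follows. This is a sketch, not part of the original answer: the function names substrings_regex and substrings_slicing and the max_size parameter are hypothetical, and the re.DOTALL flag is an addition so that '.' also matches newline characters (by default it does not, and the regex version would then skip substrings containing a newline).

```python
import re

def substrings_regex(s, max_size=3):
    """Collect substrings of length 1..max_size via overlapping lookahead matches."""
    return [
        match
        for size in range(1, max_size + 1)
        # re.DOTALL (an addition to the original one-liner) lets '.' match
        # newlines too, so the result agrees with plain slicing.
        for match in re.findall(r"(?=(.{%d}))" % size, s, flags=re.DOTALL)
    ]

def substrings_slicing(s, max_size=3):
    """Collect the same substrings by slicing at every valid start index."""
    return [
        s[start:start + size]
        for size in range(1, max_size + 1)
        for start in range(len(s) - size + 1)
    ]

s = "12345"
print(substrings_regex(s))
print(substrings_slicing(s))
```

Both calls print the list shown above. For sizes larger than len(s), the regex finds no match and the inner range is empty, so both functions simply contribute nothing for that size.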
