Python string split in a specific pattern

I am trying to split a string in this specific pattern:

 'ff19shh24c' -> ['f', 'f', '19s', 'h', 'h', '24c']

I managed to get this close:

    import re

    string = "ff19shh24c"
    parts = re.findall(r'\D+|\d+[a-z]{1}', string)
    print(parts)
    # -> ['ff', '19s', 'hh', '24c']

But now I am a little bit stuck.

Search for anything (non-greedy) and then a letter.

    import re

    string = "ff19shh24c"
    parts = re.findall(r'.*?[a-z]', string)
    print(parts)

This will give you ['f', 'f', '19s', 'h', 'h', '24c']
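To see why the non-greedy `?` matters here, the following sketch runs the same pattern with and without it. The variable names `lazy_parts` and `greedy_parts` are only illustrative and not part of the answer above.

```python
import re

string = "ff19shh24c"

# Lazy quantifier: each match stops at the first lowercase letter it can reach.
lazy_parts = re.findall(r'.*?[a-z]', string)

# Greedy quantifier: the single match runs up to the last lowercase letter.
greedy_parts = re.findall(r'.*[a-z]', string)

print(lazy_parts)    # ['f', 'f', '19s', 'h', 'h', '24c']
print(greedy_parts)  # ['ff19shh24c']
```

Note that `[a-z]` only matches lowercase ASCII letters, so an input containing uppercase letters or other separators would likely need a different character class.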

One possibility is to match zero or more digits followed by a single non-digit:

    import re

    string = 'ff19shh24c'
    parts = re.findall(r'\d*\D', string)
    print(parts)

output: ['f', 'f', '19s', 'h', 'h', '24c']
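One caveat with `\d*\D`: digits at the very end of the string have no non-digit after them, so they are not matched. If those should be kept, one possible variation (an assumption on my part, not something the answer proposes) is to add a `|\d+` alternative:

```python
import re

# Assumed variant: the extra `|\d+` alternative picks up a run of digits
# that is not followed by any non-digit character.
pattern = re.compile(r'\d*\D|\d+')

short = pattern.findall('ff19shh24c')
trailing = pattern.findall('ff19shh24cX29ZZ88')

print(short)     # ['f', 'f', '19s', 'h', 'h', '24c']
print(trailing)  # ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
```

The first alternative is tried first at every position, so the `\d+` branch only applies when no non-digit follows the digits.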

Since the question is not tagged with regex or similar, here is a for-loop approach:

    s = 'ff19shh24c'

    out = []
    tmp = ''
    was_a_digit = False  # keep track if the previous character was a digit
    for char in s:
        if char.isdigit():
            was_a_digit = True
            tmp += char
        else:
            if was_a_digit:
                tmp += char
                out.append(tmp)
                tmp = ''
                was_a_digit = False
            else:
                out.append(char)

    print(out)
    # ['f', 'f', '19s', 'h', 'h', '24c']

For strings that end with digits, the code above will lose those trailing digits, but with a slight edit they can still be retrieved.

Here is the approach that preserves all characters:

    s = 'ff19shh24cX29ZZ88'
    # ... same as above ...

    # directly after the end of the for loop
    if tmp:  # guard so that no empty string is appended when s does not end with digits
        out.append(tmp)

    print(out)
    # ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
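For convenience, the loop can also be wrapped in a function. This is only a sketch: the name `split_digits_with_letter` is made up for illustration, and the `was_a_digit` flag is dropped here because `tmp + char` gives the same result whether or not digits were collected.

```python
def split_digits_with_letter(s):
    """Split s into single non-digit characters, keeping any run of digits
    attached to the non-digit character that follows it."""
    out = []
    tmp = ''  # digits collected since the last emitted piece
    for char in s:
        if char.isdigit():
            tmp += char
        else:
            # tmp is '' when no digits preceded this character
            out.append(tmp + char)
            tmp = ''
    if tmp:  # trailing digits with no character after them
        out.append(tmp)
    return out


print(split_digits_with_letter('ff19shh24c'))
# ['f', 'f', '19s', 'h', 'h', '24c']
print(split_digits_with_letter('ff19shh24cX29ZZ88'))
# ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
```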
