
Python - Splitting numbers and letters into sub-strings with regular expression

I am creating a metric measurement converter. The user is expected to enter an expression such as 125km (a number followed by a unit abbreviation). For conversion, the numerical value must be split from the abbreviation, producing a result such as [125, 'km']. I have done this with a regular expression using re.split, but it produces an unwanted item in the resulting list:

import re
s = '125km'
print(re.split(r'(\d+)', s))

Output:

['', '125', 'km']

I neither need nor want the leading ''. How can I simply separate the numerical part of the string from the alphabetical part to produce a list using a regular expression?

What's wrong with re.findall?

>>> s = '125km'
>>> re.findall(r'[A-Za-z]+|\d+', s)
['125', 'km']

[A-Za-z]+ matches one or more letters, | means "or", and \d+ matches one or more digits.
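The question asks for [125, 'km'] with the number as an integer, while re.findall returns strings only. A minimal sketch of one way to get there, assuming the input is always digits followed by letters; the helper name parse_measurement is hypothetical and not part of the original answer:

```python
import re

def parse_measurement(text):
    """Hypothetical helper: split '125km' into [125, 'km']."""
    # fullmatch requires the whole string to be digits followed by letters.
    match = re.fullmatch(r'(\d+)([A-Za-z]+)', text)
    if match is None:
        raise ValueError(f"not a number followed by a unit: {text!r}")
    number, unit = match.groups()
    return [int(number), unit]

print(parse_measurement('125km'))  # [125, 'km']
```

Unlike re.findall, this rejects input such as 'km' or '12.5km' instead of silently returning partial pieces, which may or may not be what a converter wants.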

OR

Use a list comprehension to drop the empty strings.

>>> [i for i in re.split(r'([A-Za-z]+)', s) if i]
['125', 'km']
>>> [i for i in re.split(r'(\d+)', s) if i]
['125', 'km']
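For context on where the empty string comes from: re.split returns the text on both sides of every match, so a match at the very start (or end) of the string leaves an empty piece on that side. A short illustration, using variable names chosen here for the example:

```python
import re

# The match '125' starts at index 0, so the text before it is ''.
leading = re.split(r'(\d+)', '125km')
print(leading)   # ['', '125', 'km']

# The same happens at the end when the string finishes with a match.
trailing = re.split(r'(\d+)', 'km125')
print(trailing)  # ['km', '125', '']

# Filtering on truthiness removes the empty pieces in both cases.
cleaned = [piece for piece in leading if piece]
print(cleaned)   # ['125', 'km']
```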

Split a string into a list of sub-strings (numbers and everything else)

Using a plain loop:

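s = "125km1234string"
sub = []    # finished sub-strings
char = ""   # current run of non-digit characters
num = ""    # current run of digits
for letter in s:
    if letter.isdigit():
        # A digit ends any pending non-digit run.
        if char:
            sub.append(char)
            char = ""
        num += letter
    else:
        # A non-digit ends any pending digit run.
        if num:
            sub.append(num)
            num = ""
        char += letter
# Flush whichever run is still pending; nothing is added for an empty string.
if char:
    sub.append(char)
elif num:
    sub.append(num)
print(sub)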

Output:

['125', 'km', '1234', 'string']
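The same grouping into alternating runs can also be sketched with the standard library. The function names below are illustrative, not from the original answer. Note that \d and str.isdigit agree for ASCII input but can differ on some Unicode characters, so the two versions are only assumed equivalent for plain ASCII strings:

```python
import re
from itertools import groupby

def split_runs_regex(text):
    # \d+ grabs a run of digits, \D+ grabs a run of anything else.
    return re.findall(r'\d+|\D+', text)

def split_runs_groupby(text):
    # groupby clusters consecutive characters sharing the same isdigit() result.
    return [''.join(group) for _, group in groupby(text, key=str.isdigit)]

print(split_runs_regex("125km1234string"))    # ['125', 'km', '1234', 'string']
print(split_runs_groupby("125km1234string"))  # ['125', 'km', '1234', 'string']
```

Both return an empty list for an empty string.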
