Split a string into the string and digits

I'm trying to split this string (incoming_string) into its digits and its text, producing the following structure (result):

incoming_string = '02, 102, 702New York' # Also possible incoming strings: '06, 25Jerusalem' or '34Saint Luise'

result = {'New York': ['02', '102', '702']}

I found this approach, but I don't think it is the best way:

import re


digits = re.findall(r'\d+', incoming_string)  # ['02', '102', '702']
strings = re.findall('[a-z, A-Z]+', incoming_string)[-1]  # 'New York'

By best way I mean the most concise, understandable and Pythonic way, preferably without imports. All symbols are in the same encoding (ASCII).

Use this:

(\d{2}), (\d{3}), (\d{3})(.+)

Code:

import re

incoming_string = '02, 102, 702New York'
print(re.sub(r"(\d{2}), (\d{3}), (\d{3})(.+)", r"{\4: [\1, \2, \3]}", incoming_string))

Output:

{New York: [02, 102, 702]}
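
Note that the substitution above produces a string rather than a dictionary, and its pattern expects exactly three numbers of fixed width. A minimal sketch of a variant that builds a real dict and accepts any count of numbers, assuming the input is always a ", "-separated list of digits followed directly by the name (parse_with_match is a hypothetical helper name, not part of the original answer):

```python
import re


def parse_with_match(incoming_string):
    # Group 1: the ", "-separated digit list; group 2: everything after it.
    m = re.fullmatch(r"(\d+(?:, \d+)*)(.+)", incoming_string)
    if m is None:
        # Assumption: anything that does not fit the format is an error.
        raise ValueError(f"unexpected format: {incoming_string!r}")
    numbers, name = m.groups()
    return {name: numbers.split(", ")}


print(parse_with_match("02, 102, 702New York"))
```

The same function should also cope with the shorter inputs from the question, such as '06, 25Jerusalem' or '34Saint Luise'.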

The difficulty is finding the index where the list stops and the key starts. We can write a function that finds the first "non-list character". Then it's a matter of splitting the string in two at that index, after which we can split the first part into a list using the ", " delimiter.

def get_first_non_list_char_index(incoming_string):
    for i, c in enumerate(incoming_string):
        if c not in "1234567890, ":
            return i

incoming_string = "02, 102, 702New York"
char_index = get_first_non_list_char_index(incoming_string)

result = {incoming_string[char_index:]: incoming_string[:char_index].split(", ")}

result = {'New York': ['02', '102', '702']}
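
The same idea can probably be written without an explicit loop. A minimal sketch, assuming the name never starts with a digit, comma or space; it relies on str.lstrip to drop the list characters from the front (split_numbers_and_name is a hypothetical name used only for illustration):

```python
def split_numbers_and_name(incoming_string):
    # Strip every leading digit, comma and space; what remains is the name.
    name = incoming_string.lstrip("0123456789, ")
    # The list part is whatever was stripped from the front.
    prefix = incoming_string[:len(incoming_string) - len(name)]
    return {name: prefix.split(", ")}


for s in ("02, 102, 702New York", "06, 25Jerusalem", "34Saint Luise"):
    print(split_numbers_and_name(s))
```

This needs no imports, and the length difference between the original string and the stripped name plays the role of char_index above.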

An import-free solution, as requested:

incoming_string = '02, 102, 702New York'
letters = ''.join(i for i in incoming_string if i.isalpha() or i.isspace())
numbers = ''.join(i for i in incoming_string if i.isdigit() or i.isspace())
result = {letters.strip(): numbers.split()}
print(result)

Output:

{'New York': ['02', '102', '702']}

You can use re.split to get this result:

sl=['02, 102, 702New York',  
'06, 25Jerusalem',
'34Saint Luise']

import re

for s in sl:
    fields=re.split(r'(?<=\d)(?=[a-zA-Z])', s, maxsplit=1)
    print(s, "=>", {fields[1]:re.split(r',[ ]*',fields[0])})

Prints:

02, 102, 702New York => {'New York': ['02', '102', '702']}
06, 25Jerusalem => {'Jerusalem': ['06', '25']}
34Saint Luise => {'Saint Luise': ['34']}
