
Split string between characters with Python regex

I'm trying to split the string:

> s = "Ladegårdsvej 8B7100 Vejle"

with a regex into:

[street,zip,city] = ["Ladegårdsvej 8B", "7100", "Vejle"]

s varies a lot; the only certain part is that the zip always has 4 digits and is followed by a whitespace. My idea is thus to "match from the right" on 4 digits and a whitespace, to indicate that the string should be split at that point.

Currently I'm able to get street and city like this:

> print re.split(re.compile(r"[0-9]{4}\s"), s)
["Ladegårdsvej 8B", "Vejle"]

How would I go about splitting s as desired? In particular, how do I split in the middle of the string, between the number in street and zip?

You can use re.split, but make the four digits a capturing group:

>>> s = "Ladegårdsvej 8B7100 Vejle"
>>> re.split(r"(\d{4}) ", s)
['Ladegårdsvej 8B', '7100', 'Vejle']

From the documentation (emphasis mine):

Split string by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
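To make the quoted behaviour concrete, here is a small sketch comparing the split with and without the capturing group, and with maxsplit=1. The second address ("Somevej ...") is a made-up input used only to show the effect of maxsplit; it is not from the question.

```python
import re

s = "Ladegårdsvej 8B7100 Vejle"

# Without a group, the matched zip code is consumed by the split.
without_group = re.split(r"\d{4}\s", s)

# With a capturing group, the captured text is kept in the result.
with_group = re.split(r"(\d{4})\s", s)

# maxsplit=1 stops after the first match, so a later "4 digits + space"
# run in the remainder is left untouched (hypothetical input).
limited = re.split(r"(\d{4})\s", "Somevej 1A7100 Vejle 1234 Nord", maxsplit=1)

print(without_group)  # ['Ladegårdsvej 8B', 'Vejle']
print(with_group)     # ['Ladegårdsvej 8B', '7100', 'Vejle']
print(limited)        # ['Somevej 1A', '7100', 'Vejle 1234 Nord']
```

Note that the whitespace sits outside the parentheses, so it is dropped from the result while the digits are kept.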

Once you have the street, getting the zip is trivial:

zip_code = s[len(street):len(street) + 4]  # named zip_code to avoid shadowing the built-in zip
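A minimal end-to-end sketch of this slicing idea, assuming the street and city come from the question's original split (no capturing group). The name zip_code and the use of maxsplit=1 are choices made here for illustration; the unpacking raises ValueError if the pattern does not match.

```python
import re

s = "Ladegårdsvej 8B7100 Vejle"

# Split without a capturing group, as in the question.
street, city = re.split(r"[0-9]{4}\s", s, maxsplit=1)

# The zip code is the four characters immediately after the street.
zip_code = s[len(street):len(street) + 4]

print([street, zip_code, city])  # ['Ladegårdsvej 8B', '7100', 'Vejle']
```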

Here is a solution to your problem:

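# -*- coding: utf-8 -*-
import re

st = "Ladegårdsvej 8B7100 Vejle"
# Capturing group so the four digits are kept in the result.
# Without a trailing space in the pattern, the city keeps its leading space.
reg = r'([0-9]{4})'
rep = re.split(reg, st)
print(rep)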

A solution for the other test cases provided by RasmusP_963:

# -*- coding: utf-8 -*-
import re

st = "Birkevej 8371900 Roskilde"
# The trailing space makes the match land on the four digits directly before the space.
print(re.split(r"([0-9]{4}) ", st))
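The two snippets above could be wrapped in a small helper, sketched here under the question's stated assumption that the zip is always four digits followed by whitespace. The names split_address and ZIP_SPLIT are hypothetical, and raising ValueError on a missing zip is a design choice made for this sketch, not something any of the answers specify.

```python
import re

# Four digits (captured) followed by one whitespace character (dropped).
ZIP_SPLIT = re.compile(r"([0-9]{4})\s")


def split_address(address):
    """Return (street, zip_code, city), or raise ValueError if no zip is found."""
    parts = ZIP_SPLIT.split(address, maxsplit=1)
    if len(parts) != 3:
        raise ValueError("no 4-digit zip followed by whitespace in %r" % address)
    street, zip_code, city = parts
    return street, zip_code, city


for address in ("Ladegårdsvej 8B7100 Vejle", "Birkevej 8371900 Roskilde"):
    print(split_address(address))
# ('Ladegårdsvej 8B', '7100', 'Vejle')
# ('Birkevej 837', '1900', 'Roskilde')
```

Because re.split uses the leftmost match, an input whose street part itself contains four digits followed by whitespace would be split too early; this sketch does not handle that case.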
