
Split at first and last occurrence of a character?

I have a list of strings in this form (amount, address, payment):

"44.53 54 orchard rd Cash"
"32.34 600 sprout brook lane Card"

I am just trying to get the address from each string. It seems to me the best way to go about this would be to split at the first and last occurrence of a space. Is there any way to do this?

Python's split method is defined like this: str.split(sep=None, maxsplit=-1).

Similarly, there is str.rsplit(sep=None, maxsplit=-1), which splits starting from the right.

This means that you can split off just the beginning or just the ending:

>>> s = "44.53 54 orchard rd Cash"
>>> s.split(maxsplit=1)
['44.53', '54 orchard rd Cash']
>>> s.rsplit(maxsplit=1)
['44.53 54 orchard rd', 'Cash']

Then, to split the string into three parts, you can write a small function:

>>> def purchase_parts(purchase):
...     lsplit = purchase.split(maxsplit=1)
...     rsplit = lsplit[1].rsplit(maxsplit=1)
...     return (lsplit[0], rsplit[0], rsplit[1])
... 
>>> purchase_parts("44.53 54 orchard rd Cash")
('44.53', '54 orchard rd', 'Cash')
>>> purchase_parts("32.34 600 sprout brook lane Card")
('32.34', '600 sprout brook lane', 'Card')
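As an alternative sketch of the same idea, str.partition and str.rpartition split at the first and last occurrence of a literal separator. The helper name split_first_last and its error message are made up for illustration. Note that, unlike split() with no arguments, this does not collapse runs of whitespace.

```python
def split_first_last(line, sep=" "):
    """Split line at the first and last occurrence of sep (hypothetical helper)."""
    # partition splits at the first sep; rpartition splits at the last one.
    head, found_first, tail = line.partition(sep)
    middle, found_last, last = tail.rpartition(sep)
    # Both "found" values are empty strings when the separator is missing.
    if not found_first or not found_last:
        raise ValueError(f"expected at least two {sep!r} separators in {line!r}")
    return head, middle, last


print(split_first_last("44.53 54 orchard rd Cash"))
# ('44.53', '54 orchard rd', 'Cash')
```

A line with fewer than two separators raises ValueError here instead of failing with an IndexError.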

Still, I would suggest switching to a delimiter-separated format, because then you can simply split on that separator, and you also get direct support for importing and exporting CSV (comma-separated values) files.

Manual solution:

>>> [p.strip() for p in "32.34, 600 sprout brook lane, Card".split(',')]
['32.34', '600 sprout brook lane', 'Card']
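For the CSV route, a minimal sketch using the standard library csv module might look like this. The sample data is hypothetical and is read from an in-memory buffer; in practice it would be an open file. skipinitialspace=True drops the space after each comma, which replaces the manual strip().

```python
import csv
import io

# Hypothetical comma-separated data standing in for a real file.
data = io.StringIO(
    "44.53, 54 orchard rd, Cash\n"
    "32.34, 600 sprout brook lane, Card\n"
)

# Each row comes back as a list of strings, one per field.
rows = list(csv.reader(data, skipinitialspace=True))
addresses = [address for amount, address, payment in rows]
print(addresses)  # ['54 orchard rd', '600 sprout brook lane']
```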

You could do something like:

line = "44.53 54 orchard rd Cash"
line_parts = line.split(" ")
address = " ".join(line_parts[1:-1])

It's a bit untidy and definitely brittle to changes in the line format, but it would do the job.
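One example of that brittleness, shown on a made-up line with two spaces after the amount: split(" ") treats every single space as a separator and yields an empty field, while split() with no argument collapses runs of whitespace.

```python
line = "44.53  54 orchard rd Cash"  # hypothetical input: two spaces after the amount

print(line.split(" "))  # ['44.53', '', '54', 'orchard', 'rd', 'Cash']
print(line.split())     # ['44.53', '54', 'orchard', 'rd', 'Cash']

# The empty field survives the join as a leading space in the address.
address_literal = " ".join(line.split(" ")[1:-1])
address_whitespace = " ".join(line.split()[1:-1])
print(repr(address_literal))     # ' 54 orchard rd'
print(repr(address_whitespace))  # '54 orchard rd'
```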

You can use your method, splitting on spaces and dropping the first and last parts, but you need to join the rest (the middle) back together:

def get_address(s):
    parts = s.split()
    # parts[1:-1] drops the first (amount) and the last (payment) values;
    # ' '.join then puts back the spaces that s.split removed from the address
    return ' '.join(parts[1:-1])

Input:

print(get_address("44.53 54 orchard rd Cash"))
print(get_address("32.34 600 sprout brook lane Cash"))

Output:

54 orchard rd
600 sprout brook lane

You could also use a regular expression to be a bit more flexible and robust. Here, the two \d+ elements around \. say that the match must begin with two runs of digits separated by a dot, followed by a space. The address is the captured group (in parentheses () ), made up of one or more characters that are either word characters ( \w ) or non-word characters ( \W ), which together means any character at all, including spaces. Because the group is greedy, it extends up to the last space that is followed by a run of word characters ( \w+ ), which is the payment type.

import re

addresses = [
    "44.53 54 orchard rd Cash",
    "32.34 600 sprout brook lane Card"
]

addresses = [re.findall(r'\d+\.\d+ ([\w\W]+) \w+', address)[0] for address in addresses]
print(addresses)  # ['54 orchard rd', '600 sprout brook lane']
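A stricter variant, sketched under the assumption that the payment type is always a single word: re.fullmatch requires the whole line to fit the pattern, and named groups return all three fields. The names LINE_RE and parse_line are made up for illustration.

```python
import re

# Hypothetical pattern: decimal amount, then the address, then a one-word payment type.
LINE_RE = re.compile(r"(?P<amount>\d+\.\d+) (?P<address>.+) (?P<payment>\w+)")


def parse_line(line):
    """Return the three fields as a dict, or None if the line does not fit."""
    match = LINE_RE.fullmatch(line)
    if match is None:
        return None
    return match.groupdict()


print(parse_line("44.53 54 orchard rd Cash"))
# {'amount': '44.53', 'address': '54 orchard rd', 'payment': 'Cash'}
print(parse_line("no amount here"))  # None
```

Returning None for a non-matching line avoids the IndexError that indexing an empty findall result with [0] would raise.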

You could get the first and last fields using unpacking and then reassemble the rest to form the address:

s = "44.53 54 orchard rd Cash"
amount, *rest, payment = s.split()
address = " ".join(rest)  # '54 orchard rd'
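One thing to watch with starred unpacking: a line with only two fields unpacks without error and leaves the starred name as an empty list, while a line with fewer than two fields raises ValueError. The sketch below applies the idea to a list and flags the empty case; address_of and the third sample line are hypothetical.

```python
lines = [
    "44.53 54 orchard rd Cash",
    "32.34 600 sprout brook lane Card",
    "12.00 Cash",  # hypothetical malformed line with no address
]


def address_of(line):
    amount, *rest, payment = line.split()
    # With only two fields, rest is an empty list, so report that explicitly.
    return " ".join(rest) if rest else None


print([address_of(line) for line in lines])
# ['54 orchard rd', '600 sprout brook lane', None]
```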


 