
Split string with multiple separators from an array (Python)

Given a list of separators:

columns = ["Name:", "ID:", "Date:", "Building:", "Room:", "Notes:"]

and some strings in which some columns are left blank (and which contain random whitespace):

input = "Name:      JohnID:123:45Date:  8/2/17Building:Room:Notes:  i love notes"

how can I get this:

["John", "123:45", "8/2/17", "", "", "i love notes"]

I tried simply removing the substrings to see where I could go from there, but I am still stuck:

import re
input = re.sub(r'|'.join(map(re.escape, columns)), "", input)

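Build a regular expression from the list by inserting (.*) after each separator, then use strip to remove the whitespace from each captured group: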

import re

columns = ["Name:", "ID:", "Date:", "Building:", "Room:", "Notes:"]
s = "Name:      JohnID:123:45Date:  8/2/17Building:Room:Notes:  i love notes"

result = [x.strip() for x in re.match("".join(map("{}(.*)".format,columns)),s).groups()]

print(result)

This yields:

['John', '123:45', '8/2/17', '', '', 'i love notes']

The strip step can be handled by the regular expression itself, at the cost of a more complex pattern but a simpler overall expression:

result = re.match("".join(map(r"{}\s*(.*)\s*".format,columns)),s).groups()

More complex still: if the separators contain regex special characters, they have to be escaped (not the case here):

result = re.match("".join([r"{}\s*(.*)\s*".format(re.escape(x)) for x in columns]),s).groups()
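
One limitation of the greedy (.*) groups is that whitespace sitting just before the next separator stays inside the captured value, because the trailing \s* is free to match nothing. The following is a minimal sketch of one way around that, not the answer's own code: it swaps in lazy (.*?) groups and re.fullmatch so the pattern must consume the whole string. The function name parse_fields is hypothetical, and the sketch assumes the separators appear exactly once each, in the listed order, on a single line.

```python
import re

def parse_fields(columns, text):
    """Return the whitespace-trimmed value that follows each separator."""
    # Escape every separator and capture lazily, so the \s* on each side
    # of a group absorbs the surrounding whitespace instead of the group.
    pattern = "".join(r"{}\s*(.*?)\s*".format(re.escape(c)) for c in columns)
    # fullmatch forces the last lazy group to extend to the end of the text.
    match = re.fullmatch(pattern, text.strip())
    if match is None:
        raise ValueError("separators missing or out of order")
    return list(match.groups())

columns = ["Name:", "ID:", "Date:", "Building:", "Room:", "Notes:"]
s = "Name:      JohnID:123:45Date:  8/2/17Building:Room:Notes:  i love notes"
print(parse_fields(columns, s))
```

Raising ValueError on a failed match is a design choice for the sketch; the one-liners above would instead fail with an AttributeError when re.match returns None.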

How about using re.split?

>>> import re
>>> columns = ["Name:", "ID:", "Date:", "Building:", "Room:", "Notes:"]
>>> i = "Name:      JohnID:123:45Date:  8/2/17Building:Room:Notes:  i love notes"
>>> re.split('|'.join(map(re.escape, columns)), i)
['', '      John', '123:45', '  8/2/17', '', '', '  i love notes']

To get rid of the whitespace, split on the surrounding whitespace as well:

>>> re.split(r'\s*' + (r'\s*|\s*'.join(map(re.escape, columns))) + r'\s*', i.strip())
['', 'John', '123:45', '8/2/17', '', '', 'i love notes']
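
The split result always starts with an empty string, because the text begins with a separator. As a rough sketch (the variable names splitter, values and record are illustrative, not from the answer), the leading element can be sliced off and the remaining values paired with the column names:

```python
import re

columns = ["Name:", "ID:", "Date:", "Building:", "Room:", "Notes:"]
i = "Name:      JohnID:123:45Date:  8/2/17Building:Room:Notes:  i love notes"

# One alternative per separator, each padded with optional whitespace.
splitter = "|".join(r"\s*{}\s*".format(re.escape(c)) for c in columns)

# Drop the empty piece that precedes the first separator.
values = re.split(splitter, i.strip())[1:]

# Pair each value with its column name, minus the trailing colon.
record = dict(zip((c.rstrip(":") for c in columns), values))
print(record)
```

Unlike the re.match approach, re.split does not check that the separators appear in order, and a value that happens to contain one of the separator strings would be split again, so this assumes well-formed input.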
