How to find the spans of multiple sub-strings in one string using Python?
I have a string like
xxxx BP 160/110 12/6/2018 sitting left arm @xyz hospital xxxx HgbA1c 12% on 21/1/2019 xxxx
and another string
bp 160/110 hgba1c 12%
Now, how can I get the span of each match, as follows?
[(5, 15), (62, 72)]
Note: the patterns above may vary a lot, so I would like a dynamic solution.
Thanks in advance.
This function finds the minimum and maximum bounds of a window that contains every character of the substring (otherwise, it returns False):
import collections

def find_substring_bounds(a, b):
    need = collections.Counter(b)
    missing = len(b)
    for end, char in enumerate(a, 1):
        if need[char] > 0:
            missing -= 1
        need[char] -= 1
        if missing == 0:  # found all the characters
            start = 0
            while start < end and need[a[start]] < 0:
                need[a[start]] += 1
                start += 1
            need[a[start]] += 1
            return start, end
    return False
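The bookkeeping relies on two behaviours of `collections.Counter`: a missing key reads as 0, and counts are allowed to go negative. A negative count marks a character the window holds in surplus, which is what the shrinking `while` loop tests. A minimal sketch with made-up inputs (not taken from the question):

```python
import collections

need = collections.Counter("aab")   # characters still required: a:2, b:1
for char in "xaaab":                # consume a hypothetical window
    need[char] -= 1                 # a missing key starts at 0, so 'x' becomes -1

# Characters with a negative count are in surplus and can be dropped
# from the left edge of the window without losing a required character.
surplus = {c: -n for c, n in need.items() if n < 0}
print(need["a"], need["b"], need["x"])  # -1 0 -1
print(surplus)                          # {'a': 1, 'x': 1}
```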
Then we need to find the mid-left and mid-right bounds:
def mid_left_mid_right(a, b, left, right):
    mid_left = left
    for mid_left, (c1, c2) in enumerate(zip(a[left:], b)):
        if c1 != c2:
            break
    mid_right = right
    for mid_right, (c1, c2) in enumerate(zip(a[:right][::-1], b[::-1])):
        if c1 != c2:
            break
    return [(left, left+mid_left), (right-mid_right, right)]
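One thing to watch: each loop reports the index of the first mismatch, so if a loop runs to completion without reaching `break` (for example when `b` matches in full), the variable holds the last index compared rather than the match length. A hypothetical helper, `common_prefix_len` (not part of the answer's code), would sidestep this by counting matching pairs directly. A sketch under that assumption:

```python
from itertools import takewhile

def common_prefix_len(x, y):
    """Return how many leading characters x and y have in common."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(x, y)))

a = "xxxx BP 160/110 12/6/2018 sitting left arm @xyz hospital xxxx HgbA1c 12% on 21/1/2019 xxxx".lower()
b = "bp 160/110 hgba1c 12%"
left, right = 5, 72  # the bounds returned by the window search for this input

prefix = common_prefix_len(a[left:], b)               # matched from the left
suffix = common_prefix_len(a[:right][::-1], b[::-1])  # matched from the right
spans = [(left, left + prefix), (right - suffix, right)]
print(spans)                            # [(5, 16), (61, 72)]
print(common_prefix_len("abc", "abc"))  # 3, a full match is counted correctly
```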
Example:
s1 = "xxxx BP 160/110 12/6/2018 sitting left arm @xyz hospital xxxx HgbA1c 12% on 21/1/2019 xxxx"
s2 = "bp 160/110 hgba1c 12%"
left_, right_ = find_substring_bounds(s1.lower(), s2)
res = mid_left_mid_right(s1.lower(), s2, left_, right_)
print(res)
Output:
[(5, 16), (61, 72)]
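These spans include the space after `BP 160/110` and the space before `HgbA1c`, which is why they differ by one from the `[(5, 15), (62, 72)]` asked for in the question. If the tighter spans are wanted, one option is to trim whitespace from each span afterwards. A sketch, where `trim_span` is a hypothetical helper name:

```python
def trim_span(text, span):
    """Shrink (start, end) so that text[start:end] has no outer whitespace."""
    start, end = span
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end

s1 = "xxxx BP 160/110 12/6/2018 sitting left arm @xyz hospital xxxx HgbA1c 12% on 21/1/2019 xxxx"
res = [(5, 16), (61, 72)]  # the output shown above
trimmed = [trim_span(s1, span) for span in res]
print(trimmed)  # [(5, 15), (62, 72)]
```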
You may need to modify this code for any edge cases in your dataset.
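As one possible variation for inputs where the pattern varies, each whitespace-separated token of the second string could be located on its own with a case-insensitive regular expression, merging tokens that are separated only by whitespace in the text. This is a different technique from the window search above, and it assumes the tokens appear in the same order in both strings; `token_spans` is a hypothetical name.

```python
import re

def token_spans(text, query):
    """Locate each query token in order; merge tokens separated only by whitespace."""
    spans = []
    pos = 0
    for token in query.split():
        pattern = re.compile(re.escape(token), re.IGNORECASE)
        match = pattern.search(text, pos)  # search only after the previous token
        if match is None:
            return None  # a token is missing, so there is no complete match
        start, end = match.span()
        if spans and text[spans[-1][1]:start].isspace():
            spans[-1] = (spans[-1][0], end)  # extend the previous span
        else:
            spans.append((start, end))
        pos = end
    return spans

s1 = "xxxx BP 160/110 12/6/2018 sitting left arm @xyz hospital xxxx HgbA1c 12% on 21/1/2019 xxxx"
s2 = "bp 160/110 hgba1c 12%"
print(token_spans(s1, s2))  # [(5, 15), (62, 72)]
```

Because each token is searched independently, this version does not depend on how many groups the text splits into, but it returns `None` as soon as any token cannot be found.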