Add a space after a word if it's at the beginning of a string or if it's after one or more spaces, and at the same time it must be at the end or before \n
import re
line = "treinta y un" #example 1
line = "veinti un " #example 2
line = "un" #example 3
line = "un " #example 4
line = "uno" #example 5
line = "treinta yun" #example 6
line = "treinta y unghhg" #example 7
re_for_identificate_1 = "(?<!^)un"
re_for_identificate_2 = " un"
line = re.sub(re_for_identificate_2, " un ", line)
line = re.sub(re_for_identificate_1, "un ", line)
print(repr(line))
How can I get these outputs from those inputs?
"treinta y un " #for example 1
"veinti un " #for example 2
"un " #for example 3
"un " #for example 4
"uno" #for example 5
"treinta yun" #for example 6
"treinta y unghhg" #for example 7
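Note that for examples 4, 5, 6 and 7 the regex should not make any change: in example 4 a space is already placed after the word, in "uno" the "un" is not at the end of the string, and in "treinta yun" and "treinta y unghhg" the substring "un" is either not preceded by one or more spaces or not at the end.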
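I'm not sure you need a regex. The following code seems to do what you want.
It performs three checks: the value is a string, the string ends with "un" (so there is no trailing whitespace), and the last whitespace-separated word is exactly "un".
Here I wrap the logic in a list comprehension for demonstration.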
lines = ["treinta y un", "veinti un ", "un", "un ",
         "uno", "treinta yun", "treinta y unghhg"]
result = [line + " " if (isinstance(line, str)
                         and (line[-2:] == "un")
                         and (line.split()[-1] == "un"))
          else line
          for line in lines]
for line in result:
    print(f"'{line}'")
Output:
'treinta y un '
'veinti un '
'un '
'un '
'uno'
'treinta yun'
'treinta y unghhg'
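The same three checks can also be written as a small function, which may be easier to reuse and test than the list comprehension. This is only a sketch: the name add_space_after_final_un is hypothetical and does not appear in the original answer.

```python
def add_space_after_final_un(line):
    """Return line plus one trailing space when its last word is exactly 'un'."""
    # Hypothetical helper name, chosen for illustration only.
    if not isinstance(line, str):
        return line
    # endswith("un") rules out trailing whitespace ("un ");
    # the last-word check rules out cases such as "treinta yun".
    if line.endswith("un") and line.split()[-1] == "un":
        return line + " "
    return line


examples = ["treinta y un", "veinti un ", "un", "un ",
            "uno", "treinta yun", "treinta y unghhg"]
for text in examples:
    print(repr(add_space_after_final_un(text)))
```

Non-string values are returned unchanged, which mirrors the isinstance check in the comprehension above.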
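If you want to use a regex, you can use \bun$, which checks that the last complete word in the string is un and that nothing follows it. If that is the case, a space is added at the end of the string: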
import re
lines = ["treinta y un", "veinti un ", "un", "un ",
         "uno", "treinta yun", "treinta y unghhg"]
result = [re.sub(r'\bun$', 'un ', line) for line in lines]
Output:
[
'treinta y un ',
'veinti un ',
'un ',
'un ',
'uno',
'treinta yun',
'treinta y unghhg'
]
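Since the question also mentions "before \n", it may help to see how $ treats a trailing newline. Without flags, $ in Python's re module matches at the very end of the string and also just before a single newline at the end; \Z matches only at the very end. A short sketch (the variable names are illustrative):

```python
import re

with_newline = "treinta y un\n"

# $ also matches just before a final newline, so the space lands before "\n".
dollar_result = re.sub(r"\bun$", "un ", with_newline)
print(repr(dollar_result))   # 'treinta y un \n'

# \Z only matches at the very end, so a string ending in "\n" is left alone.
strict_result = re.sub(r"\bun\Z", "un ", with_newline)
print(repr(strict_result))   # 'treinta y un\n'

# Without re.MULTILINE, an "un" before an inner newline is not matched.
inner_result = re.sub(r"\bun$", "un ", "un\nuno")
print(repr(inner_result))    # 'un\nuno'
```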
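If you assign line = several times in a row in your code, each assignment overwrites the previous one, so only the last example is processed.
The pattern (?<!^)un asserts that the start of the string is not directly to the left, so it only excludes a match at position 0.
If you also want to exclude a match in #un, you can use (?<!\S) instead, which asserts a whitespace boundary to the left.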
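To make the difference between the two lookbehinds concrete, here is a small comparison on four inputs. It is a sketch for illustration; the names not_at_start and whitespace_boundary are assumptions, not part of the original answer.

```python
import re

# Lookbehind from the question: only rejects "un" at the very start of the string.
not_at_start = re.compile(r"(?<!^)un")
# Whitespace boundary: rejects "un" when any non-whitespace character precedes it.
whitespace_boundary = re.compile(r"(?<!\S)un")

samples = ["un", " un", "yun", "#un"]
for text in samples:
    print(repr(text),
          bool(not_at_start.search(text)),
          bool(whitespace_boundary.search(text)))

# 'un'   False True
# ' un'  True  True
# 'yun'  True  False
# '#un'  True  False
```

The first pattern misses a bare "un" and accepts "yun", which is the opposite of what the question asks for; the second behaves as required on all four inputs.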
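To make sure the pattern is at the end of the string, you can use the anchor $.
The code example uses single lines, but if you want to do the replacement in a string that contains multiple lines, you have to pass the re.MULTILINE flag to re.sub.
Example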
import re
pattern = r"(?<!\S)un$"
lines = ["treinta y un", "veinti un ", "un", "un ",
         "uno", "treinta yun", "treinta y unghhg", "#un"]
print([re.sub(pattern, 'un ', line) for line in lines])
Output
[
'treinta y un ',
'veinti un ',
'un ',
'un ',
'uno',
'treinta yun',
'treinta y unghhg',
'#un'
]
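For a single string that contains several lines, the re.MULTILINE flag mentioned above makes $ match before every newline as well as at the end of the string. A minimal sketch, assuming the same pattern and an illustrative three-line input:

```python
import re

pattern = r"(?<!\S)un$"
text = "treinta y un\nuno\nveinti un"

# flags must be passed by keyword here: the fourth positional
# argument of re.sub is count, not flags.
multiline_result = re.sub(pattern, "un ", text, flags=re.MULTILINE)
print(repr(multiline_result))   # 'treinta y un \nuno\nveinti un '
```

Passing re.MULTILINE positionally as the fourth argument is a common mistake, because it would be interpreted as count.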