
Python regex search and replace all characters after a string

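I'm trying to do something I thought would be simple, but I'm having some trouble with regex. Specifically, I want to find CAUGHT AN ERROR and everything after it on the same line, and replace it with CAUGHT AN ERROR: XXXXXX. My understanding is that .*$ should let me match through to the end of the line, but I'm not getting the right replacement from my for loop. How can I replace everything after the text I search for?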

1970-01-01 10:59:02     
1970-01-01 10:59:02    
1970-01-01 10:59:01    CAUGHT AN ERROR: rmv: cannot remove '/media/^Red^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /)
1970-01-01 10:59:01    CAUGHT AN ERROR: rmv: cannot remove '/media/^Green^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /media/ug)
1970-01-01 10:59:02    CAUGHT AN ERROR: rmv: cannot remove '/media/^Blue^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /medi0349223^BradbuXXXXXX.jpg)
1970-01-01 10:59:02    CAUGHT AN ERROR: rmv: cannot remove '/media/^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /media/usb0 XXXXXX.jpg)
1970-01-01 10:59:02    CAUGHT AN ERROR: rmv: cannot remove '/media/^Orange^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write )
1970-01-01 10:59:02  

I saved the sample log above to a file and then ran the following code:

import re

with open(r'C:\Users\Downloads\LOG\sample.log', mode='r', encoding='utf8') as log_r:
    content = log_r.read()

dict_items = {r'CAUGHT AN ERROR: [A-Z|a-z|0-9|\.|\-|\,|\_|\{|\}|\)|\(|\/]*\+': r'CAUGHT AN ERROR: XXXXXX'}

for k, v in dict_items.items():
    content = re.sub(k, v, content)

print(content)

I also tried the following patterns in my dictionary, to no avail:

r'CAUGHT AN ERROR: .\$'
r'CAUGHT AN ERROR: .*$'

Expected result

1970-01-01 10:59:02     
1970-01-01 10:59:02    
1970-01-01 10:59:01    CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:01    CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02    CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02    CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02    CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02  

r'CAUGHT AN ERROR: .*$' is the correct regex. But you need to use the re.MULTILINE flag so that $ matches at the end of each line, rather than only at the end of the whole string.

dict_items = {r'CAUGHT AN ERROR: .*$': r'CAUGHT AN ERROR: XXXXXX'}
for k, v in dict_items.items():
    content = re.sub(k, v, content, flags=re.MULTILINE)

print(content)
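
To see the effect of the flag in isolation, here is a small self-contained sketch. The three-line `sample` string is a made-up stand-in for the log file, not the asker's real data. It also shows that, for this particular pattern, simply dropping `$` should give the same result, because `.` does not match a newline unless re.DOTALL is set.

import re

# Hypothetical stand-in for the contents of sample.log
sample = (
    "1970-01-01 10:59:01    CAUGHT AN ERROR: rmv: cannot remove 'a.jpg'\n"
    "1970-01-01 10:59:02    CAUGHT AN ERROR: rmv: cannot remove 'b.jpg'\n"
    "1970-01-01 10:59:02  \n"
)

pattern = r'CAUGHT AN ERROR: .*$'
replacement = 'CAUGHT AN ERROR: XXXXXX'

# Without re.MULTILINE, $ only matches at the very end of the string
# (or just before a final newline), so neither error line is matched.
without_flag = re.sub(pattern, replacement, sample)

# With re.MULTILINE, $ also matches just before every newline.
with_flag = re.sub(pattern, replacement, sample, flags=re.MULTILINE)

# Alternative: no anchor at all. The greedy .* stops at the newline anyway.
no_anchor = re.sub(r'CAUGHT AN ERROR: .*', replacement, sample)

print(without_flag == sample)   # nothing was replaced
print(with_flag == no_anchor)   # both approaches agree
print(with_flag)

Keeping `$` together with re.MULTILINE states the intent ("up to the end of the line") more explicitly; the anchor-free version is shorter but relies on the default behaviour of `.`.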

