
Delete words from text file which start with a specific expression in Python

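I have an XML file and extract some specific information from it into a new txt file. Everything works, except that the new text file contains some words I want to delete. All of these words start with "iVB" and have different endings.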

My code is as follows:

import xml.etree.ElementTree as ET

tree = ET.parse(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Description.xml")

# Write the text of every LONGSTRING attribute to Activity.txt, one per line.
with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Activity.txt", "w") as f:
    for instance in tree.findall(".//ATTRIBUTE"):
        if instance.get("type") == "LONGSTRING":
            f.write(f"{instance.text}\n")

# Split on ";", lowercase everything, and save the result as Activity1.txt.
with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Activity.txt") as f:
    s = f.read()
s = s.replace(";", "\n").lower()
with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Activity1.txt", "w") as f:
    f.write(s)

I know I could use a regular expression such as xx = re.sub(r'\biVB\d*\b', '', x), but I don't really know how to apply it to the txt file.
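One way to apply such a substitution to a file is to read the whole file into a string, run re.sub on it, and write the result back out. The sketch below assumes that approach; the names strip_ivb_words and clean_file and the sample text are hypothetical. Note that the pattern from the question, \biVB\d*\b, only matches "iVB" followed by digits, so it would not match a token such as iVBORw0KGgo...; the sketch uses \biVB\S* instead, with re.IGNORECASE because the question's code lowercases the text before writing Activity1.txt.

```python
import re

# Matches "iVB" at a word boundary plus every following non-whitespace character.
# IGNORECASE also catches "ivb...", since the question's code calls .lower().
IVB_WORD = re.compile(r"\biVB\S*", re.IGNORECASE)


def strip_ivb_words(text: str) -> str:
    """Return text with every word starting with 'iVB' removed."""
    return IVB_WORD.sub("", text)


def clean_file(src_path: str, dst_path: str) -> None:
    """Read src_path, drop the iVB words, and write the result to dst_path."""
    with open(src_path) as src:
        cleaned = strip_ivb_words(src.read())
    with open(dst_path, "w") as dst:
        dst.write(cleaned)


# Hypothetical sample standing in for the contents of Activity.txt.
sample = "role model\niVBORw0KGgoAAAANSUhEUg==\nadmin ivbxyz end\n"
print(repr(strip_ivb_words(sample)))
```

Only the matched word is removed, so the surrounding whitespace and any now-empty line stay in place. If whole lines should disappear instead, a line-by-line filter is probably the simpler choice.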

My XML file looks like this:

<MODELS>
<MODEL version="" applib="ADOxx 1.5 Dynamic Experimentation Library" libtype="bp" modeltype="Role Model" name="Role Model - new" id="mod.50201">
<MODELATTRIBUTES>
<ATTRIBUTE name="Version number" type="STRING"/>
<ATTRIBUTE name="Author" type="STRING">Admin</ATTRIBUTE>
<ATTRIBUTE name="Creation date" type="STRING">15.10.2019, 16:23</ATTRIBUTE>
<ATTRIBUTE name="Date last changed" type="STRING">12.11.2019, 10:44:05</ATTRIBUTE>
<ATTRIBUTE name="Last user" type="STRING">Admin</ATTRIBUTE>
<ATTRIBUTE name="Keywords" type="STRING"/>
<ATTRIBUTE name="Comment" type="STRING"/>
<ATTRIBUTE name="Model type" type="ENUMERATION">Current model</ATTRIBUTE>
<ATTRIBUTE name="State" type="ENUMERATION">In process</ATTRIBUTE>
<ATTRIBUTE name="Reviewed on" type="STRING"/>
<ATTRIBUTE name="Reviewed by" type="STRING"/>
<ATTRIBUTE name="Description" type="STRING"/>
<ATTRIBUTE name="Number of objects and relations" type="INTEGER">3</ATTRIBUTE>
<ATTRIBUTE name="World area" type="STRING">w:20.05cm h:28.75cm minw:5cm minh:5cm</ATTRIBUTE>
<ATTRIBUTE name="Grid" type="STRING"/>
<ATTRIBUTE name="Zoom" type="INTEGER">100</ATTRIBUTE>
<ATTRIBUTE name="Viewable area" type="STRING">VIEW representation:graphic GRAPHIC x:-16 y:-16 w:1159 h:905 scale:1 TABLE </ATTRIBUTE>
<ATTRIBUTE name="Current mode" type="STRING"/>
<ATTRIBUTE name="Base name" type="STRING">Role Model - new</ATTRIBUTE>
<ATTRIBUTE name="Access state" type="ENUMERATION">write</ATTRIBUTE>
<ATTRIBUTE name="Current page layout" type="STRING"/>
<ATTRIBUTE name="Connector marks" type="STRING"/>
<ATTRIBUTE name="Type" type="STRING">Role Model</ATTRIBUTE>
<ATTRIBUTE name="Change counter" type="INTEGER">31</ATTRIBUTE>
<ATTRIBUTE name="Font size" type="INTEGER">0</ATTRIBUTE>
<ATTRIBUTE name="Context of version" type="STRING"/>
<ATTRIBUTE name="Position" type="STRING"/>
<ATTRIBUTE name="External tool coupling" type="STRING"/>
<ATTRIBUTE name="__GfxThumb__" type="LONGSTRING">iVBORw0KGgoAAAANSUhEUgAAALIAAAEACAIAAACYnbv1AAAACXBIWXMAAAsTAAALEwEAmpwYAAAE+klEQVR4nO3cP2hVdxjH4VeIBIxoCKYoiKIkByyxf6CriKNk6OAidBUytosg4igidKljwFVwEewQHEVcK01LKDQJSkVQmpKSixFEwQ6lwSbfFu9wz7nD80wnv+V9CZ+ce+8J3F3v3r0r+LeRrhdgGMmCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBO1l8eLFb1V18ODR1ib2a/g3bM2udr5SbXn5h63rpvmihYn9Gv4N29TG3eLBg1tVtXfvRFW9fLn+/Pmvp09/1cLcD7dtw66ymJiYqKr19fVOpr+vjSz275+M18NjGDbct2/f1kWv1+tkhy0tvYg8fry4dX38+GctTOxXtxuOjY1V1atXr/7+cc+ePZubmy3v8L42srh+/Y+qunDhdVXdvDlaVZcuHRj00L50vuHo6GhVPXr06OTJk1uHHX6T7sCzuHz58dTM9sPVpbp27fhA5364YdhwZGRkcXHx2bOHR458UlVPn/58+PCpmZkda7W2z6AH9Hq93//ceTjosX0Yhg3fvn27tLQ0Pn6o11urqvHxQ62O32HgWWxsbIzu+KVvbAx6bB+GZMOZmZmlpdq9e7Oq3rwZ6/BWUW18Evko/C3WRwMf24fh2PDKkys1Vuc3z1fV7bHb9aSuHrva9hL/GOx7izNnblXV6LHt56+fVFXdv9/904sh2XDux7mHPz3cdnjq01Pzn8+3s8A2LX1A5f+du39ubWNt2+Hk/sk7Z+50so9/lQ2FtY21yR2P0XaG0hpZDIUTR0/sPDww3tnTHS8i3Zv8brKq5r6c23Y+//18Va1908E9QxYEXkQIZEEgCwJZEMiCYFBZrN67sVCzs9NVtbKwsNxULVfTzE7XwsJyU7/crYsXv56aGtDwfta8V2dr5UbV7PTZWrnx7d3m4vzZ7vfq2oCyWF2pplleWKiqapqmqVquu1XNSjVNU1V1d2WqhuC3P1V1b7Wmm1pYqemqqo+nh2Cr7nXz3GJ1dXVqCO4V/BePswi85SSQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCGRBIAsCWRDIgkAWBLIgkAWBLAhkQSALAlkQyIJAFgSyIJAFgSwIZEEgCwJZEMiCQBYEsiCQBYEsCP4CV7z1AVse47wAAAAASUVORK5CYII=</ATTRIBUTE>
</MODELATTRIBUTES>

Try this:

with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Activity.txt", "r") as f:
    lines = f.readlines()

# Rewrite the file, keeping only the lines that do not start with "iVB".
with open(r"C:\Users\benni\Google Drive\MASTER\Masterarbeit\Coden\Activity.txt", "w") as f:
    for line in lines:
        if not line.startswith("iVB"):
            f.write(line)

Maybe this helps you. Good luck!
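As an alternative to cleaning the text file afterwards, the unwanted values could be skipped while the XML is read, so they never reach the output. The sketch below assumes that the values to drop are exactly the LONGSTRING attributes whose text starts with "iVB" (in the sample XML, the __GfxThumb__ entry). The function name extract_longstrings, the "Tasks" attribute, and the shortened inline XML are hypothetical stand-ins for the real file.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for Description.xml.
SAMPLE_XML = """<MODELS><MODEL><MODELATTRIBUTES>
<ATTRIBUTE name="Author" type="STRING">Admin</ATTRIBUTE>
<ATTRIBUTE name="Tasks" type="LONGSTRING">Check order;Ship goods</ATTRIBUTE>
<ATTRIBUTE name="__GfxThumb__" type="LONGSTRING">iVBORw0KGgoAAAANSUhEUg==</ATTRIBUTE>
<ATTRIBUTE name="Comment" type="LONGSTRING"/>
</MODELATTRIBUTES></MODEL></MODELS>"""


def extract_longstrings(root):
    """Collect LONGSTRING attribute texts, skipping those that start with 'iVB'."""
    lines = []
    for attribute in root.findall(".//ATTRIBUTE"):
        if attribute.get("type") != "LONGSTRING":
            continue
        text = attribute.text or ""  # empty elements have text None
        if text.startswith("iVB"):
            continue
        # Same post-processing as in the question: ";" becomes a newline, then lowercase.
        lines.append(text.replace(";", "\n").lower())
    return lines


root = ET.fromstring(SAMPLE_XML)
print(extract_longstrings(root))
```

Because the check runs before .lower() is applied, the case-sensitive startswith("iVB") is sufficient here. Empty elements produce an empty string in this sketch, whereas the f-string in the question would write the literal text "None" for them.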
