
Python: remove specific lines from a file

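I want to delete specific lines from this HTML file. I want to find where the string STARTDELETE occurs and delete everything from the line after it up to the line before the string ENDDELETE.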

To make this clearer, I have marked the lines to delete with "xxx". How can I do this in Python?

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Bootstrap Example</title>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
</head>
<body>
  <div class="container">
    <h2>Image Gallery</h2>
    <div class="row"> <!--STARTDELETE-->
      xxx<div class="col-xs-3">
        xxx<div class="thumbnail">
          xxx<a href="/w3images/lights.jpg" target="_blank">
          xxx<img  style="padding: 20px" src="xxx" alt="bla" >
          xxx<div class="caption">
            xxx<p>Test</p>
          xxx</div>
        xxx</a>
        xxx</div>
      xxx</div>
    </div> <!--ENDDELETE-->
  </div>
</body>
</html>

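You can start by copying and pasting that code into an input file, perhaps named "input.txt", and then write the lines you want to keep to "output.txt", skipping the lines you want to delete.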

delete = False
# Opening both files in one `with` closes them even if an error occurs
with open("input.txt") as infile, open("output.txt", "w") as outfile:
    for line in infile:
        if "<!--ENDDELETE-->" in line:
            delete = False  # stop deleting; the ENDDELETE line itself is kept
        if not delete:
            outfile.write(line)
        if "<!--STARTDELETE-->" in line:
            delete = True  # start deleting from the next line

Hope this helps!
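The same flag-based filter can also be written as a reusable generator, which makes it easy to apply to an in-memory string or to rewrite the file in place. This is only a sketch: the names `drop_between_markers` and `rewrite_in_place` are hypothetical, UTF-8 encoding is an assumption, and the in-place rewrite goes through a temporary file in the same directory before swapping it in with `os.replace`.

```python
import os
import tempfile
from typing import Iterable, Iterator

START = "<!--STARTDELETE-->"  # marker strings taken from the question
END = "<!--ENDDELETE-->"


def drop_between_markers(lines: Iterable[str],
                         start: str = START,
                         end: str = END) -> Iterator[str]:
    """Yield every line except those strictly between the two marker lines."""
    deleting = False
    for line in lines:
        if end in line:
            deleting = False  # the END line itself is kept
        if not deleting:
            yield line
        if start in line:
            deleting = True   # skipping begins on the next line


def rewrite_in_place(path: str) -> None:
    """Hypothetical helper: filter `path` into a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # newline="" keeps the original line endings untouched (UTF-8 is assumed)
    with open(path, encoding="utf-8", newline="") as src, \
            tempfile.NamedTemporaryFile("w", encoding="utf-8", newline="",
                                        dir=directory, delete=False) as tmp:
        tmp.writelines(drop_between_markers(src))
    os.replace(tmp.name, path)  # replace the original with the filtered copy
```

For HTML that is already in memory, something like `"".join(drop_between_markers(html.splitlines(keepends=True)))` should give the filtered text. Note that both marker lines are kept whole, so the closing `</div>` on the ENDDELETE line survives.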

Install beautifulsoup4 (an HTML parser / DOM manipulator), for example with pip install beautifulsoup4.

Read the data, build a "DOM" (a walkable structure) with Beautiful Soup, find the elements you want to empty, and remove their children.

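In your example, it looks like you want to empty the <div> element(s) with class=row, right? Let's assume your HTML data is stored in a file called data.html (in your particular case it probably won't be ... it will be the body of a request or something similar).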

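from bs4 import BeautifulSoup

with open('data.html', 'r') as page_f:
    soup = BeautifulSoup(page_f.read(), "html.parser")
    # `soup` now holds our "DOM tree"

# Empty every <div class="row">; comments such as <!--STARTDELETE--> are kept
for row in soup.find_all("div", class_="row"):
    # recursive=False visits direct child tags only; decompose() also
    # destroys everything nested inside each of them
    for child in row.find_all(True, recursive=False):
        child.decompose()

print(soup.prettify())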

Output:

<!DOCTYPE html>
<html lang="en">
 <head>
  <title>
   Bootstrap Example
  </title>
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet"/>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js">
  </script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js">
  </script>
 </head>
 <body>
  <div class="container">
   <h2>
    Image Gallery
   </h2>
   <div class="row">
    <!--STARTDELETE-->
   </div>
   <!--ENDDELETE-->
  </div>
 </body>
</html>

If you are going to do DOM manipulation, I strongly recommend reading up on Beautiful Soup and playing with it (it is very powerful).
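Since the HTML may arrive as a request body rather than a file, the same idea can be wrapped in a function that takes and returns a string. A minimal sketch, assuming beautifulsoup4 is installed; the function name `empty_rows` and its `css_class` parameter are hypothetical and not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup


def empty_rows(html: str, css_class: str = "row") -> str:
    """Return `html` with the child tags of every <div class=css_class> removed."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("div", class_=css_class):
        # Only direct child tags are visited; comments and text nodes stay,
        # so the <!--STARTDELETE--> marker inside the row is preserved.
        for child in row.find_all(True, recursive=False):
            child.decompose()
    return str(soup)  # str() serializes without prettify()'s re-indentation


body = (
    '<div class="container"><h2>Image Gallery</h2>'
    '<div class="row"> <!--STARTDELETE-->'
    '<div class="col-xs-3"><p>Test</p></div>'
    '</div> <!--ENDDELETE--></div>'
)
cleaned = empty_rows(body)
print(cleaned)
```

Unlike the line-based filter, this approach does not depend on the markers at all; it removes whatever tags sit inside the matching `<div>` elements, so it only fits when the block to delete coincides with an element's contents.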


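Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.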

 