Delete a specific line from a txt file in Python
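I am scraping data from a list of urls (input.txt) and saving the data in output.txt.
I want to delete each url from the input file as soon as it has been scraped in the loop.
Here is my code: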
def scrape(url):
    # do scraping and return json (the scraping code itself is omitted here)
    return json

with open("input.txt", 'r+') as urllist, open('output.txt', 'a+') as outfile:
    for url in urllist.read().splitlines():
        data = scrape(url)
        if data:
            if data['products'] is None:
                print("data NOT FOUND: %s" % url)
            else:
                for product in data['products']:
                    print("Saving data: %s" % product['data'])
                    outfile.write(product['data'])
                    outfile.write("\n")
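I added the following code inside the loop to delete each url as the loop passes over it, but it deletes all the urls at once instead of one by one: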
# start new code
d = urllist.readlines()
urllist.seek(0)
for i in d:
    if i != url:
        urllist.write(i)
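Two details of Python file handling may be relevant to the snippet above. After `read()` has consumed the whole file, the file pointer sits at the end, so a later `readlines()` on the same handle returns an empty list until `seek(0)` is called. Also, `readlines()` keeps the trailing newline on each line, while `splitlines()` drops it, so a comparison such as `i != url` can be true for every line. The following is a minimal sketch of both effects; the temporary file is a hypothetical stand-in for input.txt, and the function name `demo_pointer_behaviour` is made up for illustration.

```python
import os
import tempfile


def demo_pointer_behaviour():
    """Show what readlines() returns once read() has consumed the file."""
    # Hypothetical stand-in for input.txt, created in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "input.txt")
        with open(path, "w") as f:
            f.write("url1\nurl2\nurl3\n")

        with open(path, "r+") as urllist:
            urls = urllist.read().splitlines()   # pointer is now at end of file
            leftover = urllist.readlines()       # nothing is left to read
            urllist.seek(0)                      # rewind to the beginning
            with_newlines = urllist.readlines()  # each line keeps its "\n"

    return urls, leftover, with_newlines


urls, leftover, with_newlines = demo_pointer_behaviour()
print(urls)                         # lines without newlines
print(leftover)                     # empty list
print(with_newlines)                # lines with trailing newlines
print(with_newlines[0] == urls[0])  # False: "url1\n" is not equal to "url1"
```

Note as well that writing after `seek(0)` overwrites bytes in place and does not shorten the file; `truncate()` is needed to discard whatever remains past the new content.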
The input.txt file contains the following data:
url1
url2
url3
And the output.txt file:
data1
data2
data3
I am referring to this code
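I'm sharing an example that deletes a line from the file once that line has been used. Note that I added a function called "printFileContents" to show you what happens to the file contents after each scraping iteration. The function is not actually required; it is just a nice way to visualize what is happening. See the example below: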
def scrape(url):
    # Do some stuff
    return True


def executeScrapeIteration(input_file):
    # Start from the top of the file, then get its first line
    input_file.seek(0)
    url = input_file.readline()

    # Do your scraping and whatever else (strip the trailing newline first)
    scrape(url.strip())

    # To remove the line you just used, you have to rewrite the file, but don't include that line
    lines = input_file.readlines()
    input_file.seek(0)
    input_file.truncate()
    for line in lines:
        if line != url:
            input_file.write(line)


# This function is just to show you what happens to the file after each scrape iteration
def printFileContents(input_file, i):
    input_file.seek(0)
    print("-----------------")
    print("After iteration " + str(i) + ":\n")
    print(input_file.read())
    print("\n-----------------\n\n")
    input_file.seek(0)


# main function
if __name__ == "__main__":
    with open("input.txt", 'r+') as input_file:
        # Count the lines and then reset the pointer to 0 position
        line_count = len(input_file.readlines())
        input_file.seek(0)

        # While the file still contains url, execute an iteration of scraping
        for x in range(0, line_count):
            executeScrapeIteration(input_file)
            printFileContents(input_file, x)
My input.txt file is as follows:
url1
url2
url3
Just copy/paste my Python script and the input.txt file, then run the Python script.
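As a possible variation on the same idea, the pending urls can be held in a list, with input.txt rewritten from that list after every scrape. The sketch below assumes that approach; it is not the code from the question. The `scrape` function here is a hypothetical stub that returns fake data, `process_urls` is a made-up helper name, and the remaining urls are written to a temporary file that is then swapped in with `os.replace`, so an interrupted run should not leave a half-written input file.

```python
import os
import tempfile


def scrape(url):
    """Hypothetical stand-in for the real scraper; returns fake data."""
    return {"products": [{"data": "data for " + url}]}


def process_urls(input_path, output_path):
    """Scrape each url, then rewrite input_path without the url just handled."""
    with open(input_path) as f:
        pending = [line.strip() for line in f if line.strip()]

    with open(output_path, "a") as outfile:
        while pending:
            url = pending.pop(0)
            data = scrape(url)
            if data and data["products"]:
                for product in data["products"]:
                    outfile.write(product["data"] + "\n")
            # The url is dropped whether or not data was found.
            # Write the remaining urls to a temp file, then swap it in.
            tmp_path = input_path + ".tmp"
            with open(tmp_path, "w") as tmp:
                tmp.writelines(u + "\n" for u in pending)
            os.replace(tmp_path, input_path)


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as workdir:
    input_path = os.path.join(workdir, "input.txt")
    output_path = os.path.join(workdir, "output.txt")
    with open(input_path, "w") as f:
        f.write("url1\nurl2\nurl3\n")

    process_urls(input_path, output_path)

    with open(input_path) as f:
        remaining = f.read()
    with open(output_path) as f:
        saved = f.read().splitlines()

print(repr(remaining))  # input file is empty once every url is processed
print(saved)            # one output line per scraped url
```

Rewriting the whole file on every iteration is simple but costs time proportional to the file size each time, so for very long url lists a different bookkeeping scheme may be preferable.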