Python: Trying to write to a .txt file only the lines that contain a specific word, instead of the whole text
Trying to add code that extracts only the lines that contain "word" and writes a new .txt file from requests
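This code opens a text file containing websites (list.txt), then extracts the URLs archived for those websites on web.archive.org and writes them to a new text file (urls.txt). I only need the links from web.archive.org that contain "word", but I get an error:

    if 'word' in url:
    IndentationError: unexpected indent

Can someone explain why and provide the correct code here?

Code: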
urls = []
with open("list.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)

archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"

with open("url.txt", "w") as f_out:
    for url in urls:
        r = requests.get(archive_url.format(url))
         if 'word' in url:
        print(r.text, file=f_out)
        print("\n", file=f_out)
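There are two problems: the if statement has a stray leading space, and the statements that belong to it must be indented one level beneath it. This should solve your problem: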
import requests

# Read the list of websites, skipping blank lines.
urls = []
with open("list.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)

archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"

with open("url.txt", "w") as f_out:
    for url in urls:
        r = requests.get(archive_url.format(url))
        # `if` lines up with the statement above it; its body is indented one level deeper.
        if 'word' in url:
            print(r.text, file=f_out)
            print("\n", file=f_out)
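Note that `'word' in url` tests the site name read from list.txt, not the links returned by web.archive.org. If the goal is to keep only the returned links that contain "word", one possible approach is to filter the response text line by line. The sketch below is not part of the original answer: the helper name `lines_containing` is hypothetical, `sample_response` is made-up data standing in for `r.text` (assuming one URL per line, as `output=text&fl=original` suggests), and an in-memory buffer stands in for the output file so the example runs without network access.

```python
import io


def lines_containing(text, word):
    """Return the lines of `text` that contain `word` (case-sensitive)."""
    return [line for line in text.splitlines() if word in line]


# Hypothetical stand-in for r.text: one archived URL per line.
sample_response = (
    "http://example.com/\n"
    "http://example.com/word/page1\n"
    "http://example.com/about\n"
    "http://example.com/files/keyword.pdf\n"
)

matches = lines_containing(sample_response, "word")

# Write one match per line, as the loop would do with the real output file.
f_out = io.StringIO()  # stands in for open("url.txt", "w")
for match in matches:
    print(match, file=f_out)

print(f_out.getvalue(), end="")
```

In the loop above, this would amount to replacing the `if 'word' in url:` block with `for match in lines_containing(r.text, "word"): print(match, file=f_out)`. The substring test also matches longer words such as "keyword"; a stricter match would need a different check.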