Is there a simple way to read lines from a text file into this Beautiful Soup Python script?
How can I read lines from a text file into this script instead of having to list the URLs inside the script? Thank you.
from bs4 import BeautifulSoup
import requests
url = "http://www.url1.com"
response = requests.get(url)
data = response.text
soup = BeautifulSoup(data, 'html.parser')
categories = soup.find_all("a", {"class":'navlabellink nvoffset nnormal'})
for category in categories:
    print(url + "," + category.text)
My text.file contains one URL per line, separated by newlines:
http://www.url1.com
http://www.url2.com
http://www.url3.com
http://www.url4.com
http://www.url5.com
http://www.url6.com
http://www.url7.com
http://www.url8.com
http://www.url9.com
# Open the file and read every line into a list
with open('text.file', 'r') as file1:
    lines = file1.readlines()

# Strip the newline character from each line and number the lines
for count, line in enumerate(lines):
    print("Line{}: {}".format(count, line.strip()))
Then just use each stripped line in place of the hard-coded url variable in your script.
To read URLs from a.txt, you can use this script:
import requests
from bs4 import BeautifulSoup

with open('a.txt', 'r') as f_in:
    # Strip surrounding whitespace (including the newline) from each line
    for line in map(str.strip, f_in):
        # Skip blank lines
        if not line:
            continue
        response = requests.get(line)
        data = response.text
        soup = BeautifulSoup(data, 'html.parser')
        categories = soup.find_all("a", {"class":'navlabellink nvoffset nnormal'})
        for category in categories:
            # line holds the URL that was just fetched
            print(line + "," + category.text)
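The line-handling part of the loop above can be tried on its own, without any network access. This is a minimal sketch, assuming made-up file contents held in an io.StringIO object (standing in for a.txt), that shows what map(str.strip, ...) and the blank-line check do:

```python
import io

# Hypothetical file contents standing in for a.txt, including a blank line
# and some stray whitespace around the second URL.
f_in = io.StringIO("http://www.url1.com\n\n  http://www.url2.com  \n")

cleaned = []
for line in map(str.strip, f_in):
    # An empty string is falsy, so blank lines are skipped here.
    if not line:
        continue
    cleaned.append(line)

print(cleaned)
```

Only the two non-empty, stripped URLs end up in cleaned, so a stray empty line in the file should not turn into a request for an empty URL.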
For the sake of this example, let's say that your file is named urls.txt. In Python, it is very easy to open a file and read its contents.
with open('urls.txt', 'r') as f:
    urls = f.read().splitlines()
# Your list of URLs is now in the urls list!
The 'r' after 'urls.txt' simply tells Python to open the file in read mode. If you don't need to modify a file, it is best practice to open it in read-only mode. f.read() returns the entire contents of the file as a single string, including the newline characters (\n), so splitlines() splits that string at the line breaks and gives you a list of lines without them.
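To see the difference in practice, here is a small sketch that compares readlines() with read().splitlines(). It writes three of the question's example URLs to a temporary file; the temporary file is an assumption made only so the snippet is self-contained:

```python
import os
import tempfile

# Write a small sample file; the temporary path is only for illustration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("http://www.url1.com\nhttp://www.url2.com\nhttp://www.url3.com\n")
    path = tmp.name

with open(path, 'r') as f:
    with_newlines = f.readlines()   # each item keeps its trailing '\n'

with open(path, 'r') as f:
    urls = f.read().splitlines()    # line breaks are removed

os.remove(path)  # clean up the sample file

print(with_newlines)
print(urls)
```

Each item in urls can then be used as-is in place of the hard-coded url variable, for example in a loop such as for url in urls.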