Parsing span with Beautiful Soup
I am trying to parse several web pages, looking for a 'span' that sits inside a div tag with a particular class_. If the text of that span equals a given string, e.g. 'Line', the function should return the actual link of the website.
The error message that I am getting:
  line 28, in <module>
    soup = BeautifulSoup(url_html,"html.parser")
  line 245, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'Response' has no len()
import csv
from bs4 import BeautifulSoup
import requests

contents = []

def condition_check():
    for sp in soup.find("div",class_='-vDIg'):
        check = sp.span
        if check in ['Line','LINE ID',]:
            return link

filename = 'link_business_filter.csv'
with(open(filename,'rt')) as f:
    data = csv.reader(f)
    for row in data:
        links = row[0]
        contents.append(links)

for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)
You are passing the Response object returned by requests.get() to Beautiful Soup. You need to pass the HTML content instead, like this:
for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html.content,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)
In short: replace url_html with url_html.content in the BeautifulSoup call.
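
For reference, here is a minimal sketch of how the whole check might look once the response body is passed to Beautiful Soup. The names has_target_span, filter_links and TARGET_LABELS, the 10-second timeout, and the sample HTML are assumptions made for illustration; only the class name '-vDIg' and the labels 'Line' and 'LINE ID' come from the question. It also compares the span's text rather than the tag itself, since the original condition_check appears to compare a Tag object with plain strings, which would likely never match.

```python
import requests
from bs4 import BeautifulSoup

# Labels taken from the question; adjust to the real page content.
TARGET_LABELS = {"Line", "LINE ID"}


def has_target_span(html, css_class="-vDIg"):
    """Return True if a <span> inside a <div class=css_class> has a target label."""
    soup = BeautifulSoup(html, "html.parser")
    for div in soup.find_all("div", class_=css_class):
        for span in div.find_all("span"):
            # Compare the span's text, not the Tag object itself.
            if span.get_text(strip=True) in TARGET_LABELS:
                return True
    return False


def filter_links(links):
    """Fetch each link and keep those whose page contains a target span."""
    matches = []
    for link in links:
        response = requests.get(link, timeout=10)
        # Pass the body (bytes), not the Response object, to BeautifulSoup.
        if has_target_span(response.content):
            matches.append(link)
    return matches


# Offline check on a made-up HTML snippet (no network access needed).
sample = '<div class="-vDIg"><span>LINE ID</span></div>'
print(has_target_span(sample))
```

With the list built from the CSV file in the question, filter_links(contents) would return the links whose pages contain a matching span. Passing the parsed HTML into the function as an argument, instead of relying on the global soup and link variables, keeps the check easy to test on its own.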