
Parsing span with Beautiful Soup

I am trying to parse some websites to find a 'span' that sits inside a div tag with a particular class_. If the text of that span equals a given string, e.g. 'line', the function should return the actual link of the website.

The error message that I am getting:

line 28, in <module>
    soup = BeautifulSoup(url_html,"html.parser")
line 245, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'Response' has no len()

import csv
from bs4 import BeautifulSoup
import requests

contents = []

def condition_check():
    for sp in soup.find("div",class_='-vDIg'):
        check = sp.span
        if check in ['Line','LINE ID',]:
            return link


filename = 'link_business_filter.csv'

with(open(filename,'rt')) as f:
    data = csv.reader(f)

    for row in data:
        links = row[0]
        contents.append(links)



for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)

You are passing the Response object returned by requests.get() to Beautiful Soup. You need to pass the HTML content instead, like this:

for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html.content,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)

url_html --> url_html.content
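
Passing the content fixes the TypeError, but condition_check() in the question may still not return what is expected: soup.find() returns a single tag (or None) rather than a list of matches, and sp.span is a tag object, so comparing it with the strings 'Line' or 'LINE ID' never matches. A minimal sketch of one way to restructure it follows, assuming the goal is to return the link when any span inside a div with class '-vDIg' has one of those labels as its text. The html and link parameters, the TARGET_LABELS name, and the sample markup are illustrative assumptions, not part of the original post.

```python
from bs4 import BeautifulSoup

# Labels taken from the question; the set name itself is hypothetical.
TARGET_LABELS = {"Line", "LINE ID"}


def condition_check(html, link):
    """Return link if a span inside div.-vDIg has one of the target labels."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns every matching div; find would return only the first.
    for div in soup.find_all("div", class_="-vDIg"):
        for span in div.find_all("span"):
            # Compare the span's text, not the tag object itself.
            if span.get_text(strip=True) in TARGET_LABELS:
                return link
    return None


# Offline demonstration with made-up markup (an assumption, not the real page).
sample_html = '<div class="-vDIg"><span>LINE ID</span></div>'
print(condition_check(sample_html, "https://example.com/profile"))
```

In the loop from the question, this version would be called as condition_check(requests.get(link).content, link), so the function no longer depends on the global soup and link variables.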
