
Parsing a span with Beautiful Soup

I am trying to parse some websites to find a 'span' that sits inside a div tag with a particular class_. If the span's text equals a given string, e.g. 'Line', the function should return the actual link of the website.

The error message that I am getting:

line 28, in <module>
    soup = BeautifulSoup(url_html,"html.parser")
line 245, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'Response' has no len()

import csv
from bs4 import BeautifulSoup
import requests

contents = []

def condition_check():
    for sp in soup.find("div",class_='-vDIg'):
        check = sp.span
        if check in ['Line','LINE ID',]:
            return link


filename = 'link_business_filter.csv'

with(open(filename,'rt')) as f:
    data = csv.reader(f)

    for row in data:
        links = row[0]
        contents.append(links)



for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)

You are passing the Response object returned by requests.get() to Beautiful Soup. You need to pass the HTML content instead, like this:

for link in contents:
    url_html = requests.get(link)
    soup = BeautifulSoup(url_html.content,"html.parser")
    con_fltr = condition_check()
    print(con_fltr)

url_html --> url_html.content
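
Once the parsing error is gone, condition_check() from the question will probably still not return a link: iterating over the result of soup.find() walks the div's children, and `check in ['Line','LINE ID',]` compares a Tag object to strings, which will not match. A minimal sketch of one way to write the check, assuming the labels appear as the text of span elements inside div.-vDIg. The function name link_if_labelled, the LABELS set, and the sample HTML are hypothetical and are not taken from the question's real pages:

```python
from bs4 import BeautifulSoup

# Labels to look for; taken from the list in the question.
LABELS = {"Line", "LINE ID"}


def link_if_labelled(html, link, labels=LABELS):
    """Return `link` if a <span> inside a div with class '-vDIg'
    has text equal to one of `labels`; otherwise return None."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns an empty list when nothing matches,
    # so no None check is needed before looping.
    for div in soup.find_all("div", class_="-vDIg"):
        for span in div.find_all("span"):
            # Compare the span's text, not the Tag object itself.
            if span.get_text(strip=True) in labels:
                return link
    return None


# Hypothetical sample markup standing in for a downloaded page.
sample_html = '<div class="-vDIg"><span>LINE ID</span></div>'
result = link_if_labelled(sample_html, "https://example.com/profile")
print(result)
```

In the loop from the question this would be called as link_if_labelled(url_html.content, link). Passing the HTML and the link as arguments also avoids relying on the global soup and link variables.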
