SyntaxError while scraping Google with BeautifulSoup

I am scraping Google search results, but I keep getting a SyntaxError. Here's the code:

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,} 

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})
print(l)

I get:

SyntaxError: keyword can't be an expression

on the line l = soup.find_all('h', 'attrs' = {"class":'LC20lb'}). Can someone please tell me what I'm doing wrong?

There should not be quotes around attrs. It is the name of a keyword argument, so it must be written as a bare identifier, not as a string literal:

l = soup.find_all('h' ,   attrs  = {"class":'LC20lb'})
# not:                   _     _
#l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})    
#                        ^     ^
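
To see why the parser rejects the quoted form, here is a minimal sketch that needs only the standard library. The find_all function below is a hypothetical stand-in that only mimics the (name, attrs) call shape; it is not BeautifulSoup's implementation. The exact wording of the SyntaxError message varies between Python versions, so the sketch only checks whether the source parses.

```python
import ast


def find_all(name, attrs=None):
    """Hypothetical stand-in that mimics the (name, attrs) call shape."""
    return (name, attrs)


def parses(source):
    """Return True if the source string is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# A string literal on the left of '=' inside a call is not a valid keyword name.
bad_call = "find_all('h', 'attrs' = {'class': 'LC20lb'})"
# A bare identifier is.
good_call = "find_all('h', attrs={'class': 'LC20lb'})"

print(parses(bad_call))
print(parses(good_call))

# If the keyword name really is only available as a string at runtime,
# dictionary unpacking is the supported way to pass it.
kwargs = {"attrs": {"class": "LC20lb"}}
result = find_all("h", **kwargs)
print(result)
```

The error is raised when the file is compiled, before any request is sent, so it has nothing to do with Google or the network.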

Alternatively, pass the attribute dictionary as the second positional argument:

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,}

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
l = soup.find_all('h', {"class":'LC20lb'})
print(l)
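
The positional and keyword spellings can be compared offline with a small sketch. It parses a made-up HTML snippet instead of a live Google response, which is an assumption for illustration; the real markup and class names may differ. The snippet uses h3 as the tag name, also an assumption: find_all matches tag names exactly, so 'h' would only match elements literally named h.

```python
from bs4 import BeautifulSoup

# Made-up sample markup (an assumption; not real Google output).
html = """
<div><h3 class="LC20lb">Python (programming language) - Wikipedia</h3></div>
<div><h3 class="other">Unrelated heading</h3></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Three equivalent ways to filter on the class attribute.
by_position = soup.find_all("h3", {"class": "LC20lb"})      # attrs passed positionally
by_keyword = soup.find_all("h3", attrs={"class": "LC20lb"})  # attrs passed by keyword
by_class_ = soup.find_all("h3", class_="LC20lb")             # class_ shortcut keyword

titles = [tag.get_text() for tag in by_keyword]
print(titles)
```

All three calls return the same single heading here; which spelling to use is mostly a matter of taste.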

