How do I get a list of search result URLs?

from bs4 import BeautifulSoup
import requests

searchresults = []
search = 'seo'
url = 'https://www.google.com/search'

headers = {
    'Accept': '*/*',
    'Accept-Language': 'en-US,en;q=0.5',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82',
}
parameters = {'q': search}

content = requests.get(url, headers=headers, params=parameters).text
soup = BeautifulSoup(content, 'html.parser')

search = soup.find(id='search')
first_link = search.find('a')

searchresults.append(first_link['href'])

for result in searchresults:
    print(result)

How do I return the whole list of search result URLs? I would like to add multiple pages later on so I can index all the URLs.

If you want to get all the links from the search results, replace the code after search = soup.find(id='search') with:

a_tags = search.find_all('a', href=True)

searchresults = [tag['href'] for tag in a_tags]

for link in searchresults:
    print(link)
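Note that in Google's HTML results, many of those hrefs may be internal redirects of the form /url?q=&lt;target&gt; rather than the target URL itself. A small stdlib-only helper (illustrative sketch, not part of your code) can recover the real destination from such a link:

```python
from urllib.parse import urlparse, parse_qs

def extract_target(href):
    """If href is a Google-style '/url?q=...' redirect, return the
    real target URL; otherwise return href unchanged."""
    if href.startswith('/url?'):
        query = parse_qs(urlparse(href).query)
        # 'q' holds the destination; fall back to the raw href if absent
        return query.get('q', [href])[0]
    return href

print(extract_target('/url?q=https://example.com/&sa=U'))  # https://example.com/
```

You could map extract_target over searchresults before printing to get clean, absolute URLs.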

Your code currently returns only one link because search.find('a') gives just the first match, whereas search.find_all('a', href=True) returns every a tag that has an href attribute.
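Since you mention adding multiple pages: Google's search endpoint commonly paginates via a start query parameter (10 results per page), so you can fetch page N by adding start=N*10 to your existing parameters dict. A minimal sketch, assuming that parameter (the helper name is illustrative):

```python
def page_parameters(query, pages, results_per_page=10):
    """Build one params dict per results page, suitable for passing
    as requests.get(url, params=...) in a loop."""
    return [
        {'q': query, 'start': page * results_per_page}
        for page in range(pages)
    ]

for params in page_parameters('seo', 3):
    print(params)
```

Each dict can then be used in your existing requests.get call, collecting the hrefs from every page into one list. Keep in mind Google may rate-limit or block repeated automated requests.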
