Retrieving links from a Google search with BeautifulSoup in Python

Question

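I'm building a Twitter bot with Tweepy and BeautifulSoup4. I want to save the results of a request in a list, but my script no longer works (it did work a few days ago). I've been staring at it and can't figure out why. Here is my function: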

import requests
import tweepy
from bs4 import BeautifulSoup
import urllib
import os
from tweepy import StreamListener
from TwitterEngine import TwitterEngine
from ConfigEngine import TwitterAPIConfig
import urllib.request
import emoji
import random

# desktop user-agent
USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
# mobile user-agent
MOBILE_USER_AGENT = "Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36"




# Retrieve the links
def parseLinks(url):
    headers = {"user-agent": USER_AGENT}
    resp = requests.get(url, headers=headers)
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.content, "html.parser")
        results = []
        for g in soup.find_all('div', class_='r'):
            anchors = g.find_all('a')
            if anchors:
                link = anchors[0]['href']
                results.append(link)
        return results

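The url parameter is 100% correct in the rest of the code. As output, I get None. More precisely, execution stops after the results = [] line (so it never enters the for loop).

Any ideas? Thank you very much!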

Answer 1

Google appears to have changed the HTML markup on the page. Try changing the search from class="r" to class="rc":

import requests
from bs4 import BeautifulSoup


USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

def parseLinks(url):
    headers = {"user-agent": USER_AGENT}
    resp = requests.get(url, headers=headers)
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.content, "html.parser")
        results = []
        for g in soup.find_all('div', class_='rc'): # <-- change 'r' to 'rc'
            anchors = g.find_all('a')
            if anchors:
                link = anchors[0]['href']
                results.append(link)
        return results

url = 'https://www.google.com/search?q=tree'
print(parseLinks(url))

Prints:

['https://en.wikipedia.org/wiki/Tree', 'https://simple.wikipedia.org/wiki/Tree', 'https://www.britannica.com/plant/tree', 'https://www.treepeople.org/tree-benefits', 'https://books.google.sk/books?id=yNGrqIaaYvgC&pg=PA20&lpg=PA20&dq=tree&source=bl&ots=_TP8PqSDlT&sig=ACfU3U16j9xRJgr31RraX0HlQZ0ryv9rcA&hl=sk&sa=X&ved=2ahUKEwjOq8fXyKjsAhXhAWMBHToMDw4Q6AEwG3oECAcQAg', 'https://teamtrees.org/', 'https://www.woodlandtrust.org.uk/trees-woods-and-wildlife/british-trees/a-z-of-british-trees/', 'https://artsandculture.google.com/entity/tree/m07j7r?categoryId=other']
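
Since the class name has already changed once, it may change again, so it can help to keep the parsing separate from the HTTP request and check it against saved markup. The following is only a minimal sketch under that assumption: extract_links, its css_class parameter, and SAMPLE_HTML are hypothetical names and data introduced for illustration, not part of the original code or of Google's actual markup. Note also that parseLinks above returns None whenever the status code is not 200, whereas this helper always returns a list.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a saved results page.
SAMPLE_HTML = """
<div class="rc"><a href="https://example.com/a">A</a><a href="https://example.com/cached">Cached</a></div>
<div class="rc"><span>no anchor here</span></div>
<div class="other"><a href="https://example.com/ignored">Ignored</a></div>
<div class="rc"><a href="https://example.com/b">B</a></div>
"""


def extract_links(html, css_class="rc"):
    """Return the first href found inside each <div> carrying css_class."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.find_all("div", class_=css_class):
        # Take the first anchor that actually has an href attribute.
        anchor = block.find("a", href=True)
        if anchor is not None:
            results.append(anchor["href"])
    # Always a list (possibly empty), never None.
    return results


links = extract_links(SAMPLE_HTML)
print(links)  # ['https://example.com/a', 'https://example.com/b']
```

If the selector stops matching, calling extract_links(SAMPLE_HTML, "r") on this sample returns an empty list, which makes a markup change easy to tell apart from a failed request.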

Solution 1
1 Accepted 2020-10-09 22:35:11