
Why does my BeautifulSoup element stay the same if the element on the website changes?

import requests, bs4
import time


def get_tesla_price():
    website = "https://finance.yahoo.com/quote/TSLA/"
    res = requests.get(website)
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    elem = soup.select("span[class='Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)']")

    price = elem[0].getText()
    print(price)

for _ in range(10):
    get_tesla_price()
    time.sleep(2)

The code above should scrape the current price of Tesla stock. If you run it, you will see that price stays the same for all 10 function calls. But if you open the URL stored in website in a browser, you will see that the stock price changes almost every second.

Why is price the same value every time I call my function?

This is because of your user agent. By default, requests identifies itself as python-requests, so Yahoo detects that you are a robot. Add a browser-like User-Agent to the request headers:

import requests, bs4

def get_tesla_price():
    website = "https://finance.yahoo.com/quote/TSLA/"
    # Send a browser-like User-Agent instead of the default python-requests one
    res = requests.get(website, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'})
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    elem = soup.select("span[class='Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)']")

    price = elem[0].getText()
    res.close()
    print(price)
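
If the function is called in a loop, it may be more convenient to set the header once on a requests.Session rather than on every call. The following is only a sketch: the names BROWSER_UA and make_session are hypothetical, and it makes no network request, it just shows which User-Agent would be sent with and without the override.

```python
import requests

# Hypothetical constant: the browser User-Agent string used in the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


# What requests sends when no header is supplied ("python-requests/<version>").
default_ua = requests.utils.default_user_agent()

session = make_session()
print("default:", default_ua)
print("session:", session.headers["User-Agent"])
```

With such a session, session.get(website) could stand in for requests.get(website, headers=...) inside get_tesla_price().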

Or you can do it with a single module, requests-html:

from requests_html import HTMLSession
import time

session = HTMLSession()

def get_tesla_price():
    website = "https://finance.yahoo.com/quote/TSLA/"
    r = session.get(website)
    price = r.html.find("span[class='Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)']", first=True)
    print(price.text)
    

for _ in range(20):
    get_tesla_price()
    time.sleep(1)

Output:

682.66
682.66
682.53
682.53
682.53
682.65
682.62
682.69
682.65
682.71
682.71
682.83
682.83
682.83
682.83
682.79
682.79
682.79
682.79
682.80
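
The output above repeats a value whenever the page has not updated between two polls. A minimal sketch for keeping only the changes is shown here. The helper watch and the fake feed are hypothetical, and it assumes a fetch function that returns the price as a string (get_tesla_price() would need to return the value instead of printing it).

```python
import time


def watch(fetch_price, polls=10, delay=1.0, sleep=time.sleep):
    """Poll fetch_price() and return only the values that differ from the previous poll."""
    changes = []
    last = None
    for i in range(polls):
        price = fetch_price()
        if price != last:
            changes.append(price)
            last = price
        if i < polls - 1:
            sleep(delay)  # wait between polls, but not after the last one
    return changes


# Fake feed standing in for the real scraper, using values from the output above.
fake_feed = iter(["682.66", "682.66", "682.53", "682.53", "682.65"])
changes = watch(lambda: next(fake_feed), polls=5, delay=0)
print(changes)  # ['682.66', '682.53', '682.65']
```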

