How can I get changing data values from a website with Beautiful Soup?

I am trying to scrape BTC-USDT data from Binance with BeautifulSoup. I do get the value I want, but on the website it changes every second, while the value printed to my console stays the same and only rarely changes. Basically, I get the same data every time I request the page, even though the site keeps updating it. What can I do?

from bs4 import BeautifulSoup
import requests
import time



while(True):
    url='https://www.binance.com/tr/trade/BTC_USDT'
    HTML=requests.get(url)
    html_content=HTML.content
    soup=BeautifulSoup(HTML.text,'html.parser')
    paper=str((soup.find('title',attrs={'data-shuvi-head':'true'})))
    print(paper)
    time.sleep(5)

This page uses JavaScript to update its data, but BeautifulSoup can't run JavaScript. You need Selenium to control a real web browser, which can run JavaScript.

from selenium import webdriver
import time
             
url = 'https://www.binance.com/tr/trade/BTC_USDT'  # PEP8: spaces around `=`

#driver = webdriver.Chrome()
driver = webdriver.Firefox()
driver.get(url)

while True:  # PEP8: no need `()`
    try:
        #print(driver.title)
        print(driver.title.split(' ')[0].strip())
    except Exception as ex:
        print('Exception:', ex)
        
    time.sleep(5)
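
The `driver.title.split(' ')[0]` step in the loop above can be pulled into a small helper, which makes it easier to handle a title that has not loaded yet. This is only a sketch: the helper name `price_from_title` is hypothetical, and it assumes the page title starts with the price (for example `43210.55 | BTCUSDT | Binance Spot`), which the site may change at any time.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def price_from_title(title: str) -> Optional[Decimal]:
    """Return the leading number of a page title, or None if there is none.

    Assumption: the title looks like '43210.55 | BTCUSDT | ...'.
    """
    parts = title.split()
    if not parts:
        # Empty title, e.g. the page has not finished loading.
        return None
    try:
        # Drop thousands separators such as '43,210.55' before parsing.
        value = Decimal(parts[0].replace(',', ''))
    except InvalidOperation:
        # The first word is not a number (e.g. just 'Binance').
        return None
    # Reject special values such as NaN or Infinity.
    return value if value.is_finite() else None
```

Inside the Selenium loop this could be used as `print(price_from_title(driver.title))`, which prints `None` instead of raising when the title is not in the expected form.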

Alternatively, you can check DevTools (Network tab) in Chrome / Firefox to see the URL that JavaScript uses to get new data, and then try to use that URL with requests. Because JavaScript usually receives data as JSON, you will not need BeautifulSoup but the json module.
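
As a rough illustration of the JSON approach, the sketch below decodes a response body with the `json` module instead of parsing HTML. The body in `sample_body` and the helper name `extract_price` are hypothetical; the real URL and field names have to be read from the Network tab.

```python
import json
from decimal import Decimal

# Hypothetical response body; the real shape depends on the endpoint
# you find in DevTools.
sample_body = '{"symbol": "BTCUSDT", "price": "43210.55"}'


def extract_price(body: str, key: str = "price") -> Decimal:
    """Decode a JSON body and return the value stored under `key`."""
    data = json.loads(body)
    # Prices are often sent as strings; Decimal keeps them exact.
    return Decimal(data[key])


print(extract_price(sample_body))
```

When the body comes from requests, `response.json()` performs the same decoding step as `json.loads(response.text)`.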

But first check whether you can get it with the official Binance API.


PEP 8 -- Style Guide for Python Code


EDIT

Example with Binance API: Current Average Price

import requests
import time

url = 'https://api.binance.com/api/v3/avgPrice?symbol=BTCUSDT'

while True:
    response = requests.get(url)
    data = response.json() 
    print(data['price'])
    
    time.sleep(5)
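
Since an averaged price may stay the same between polls, it can help to print only when the value actually changes. The generator below is one possible sketch of that idea; `watch_changes` and its parameters are hypothetical names, and the fetching itself is passed in as a callable so the polling logic stays independent of the HTTP library.

```python
import time
from typing import Callable, Iterator, Optional


def watch_changes(fetch: Callable[[], str],
                  interval: float = 5.0,
                  max_polls: Optional[int] = None) -> Iterator[str]:
    """Call fetch() repeatedly and yield a value only when it changes.

    max_polls=None polls forever; a number stops after that many calls.
    """
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        value = fetch()
        polls += 1
        if value != last:
            last = value
            yield value
        # Sleep between polls, but not after the final one.
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

With the endpoint from the example above, it could be driven as `for price in watch_changes(lambda: requests.get(url, timeout=10).json()['price']): print(price)`.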
