
How to grab spot price from Yahoo Finance using BeautifulSoup

I am trying to get the spot price of the SPY ETF: https://finance.yahoo.com/quote/SPY/options

I have mostly tried `soup.find_all` on the nested "div" tags:

    from bs4 import BeautifulSoup
    import urllib.request

    url = 'https://finance.yahoo.com/quote/SPY/options/'
    source = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(source,'lxml')

    for div in soup.find_all('div', class_ = "My(6px) smartphone_Mt(15px)"):
        print(div.text)

    for div in soup.find_all('div', class_ = "D(ib) Maw(65%) Ov(h)"):
        print(div.text)

    for div in soup.find_all('div', class_ = "D(ib) Mend(20px)"):
        print(div.text)

Nothing is printed. I also tried the following:

    print(soup.find('span', attrs = {'data-reactid':"35"}).text)

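This prints "Last Price". Obviously what I want is the last price itself, not the words "Last Price", but it is close.

Nested inside that span tag is some HTML that contains the number I want. I am guessing the right answer has to do with the "react text: 36" inside the span tag (I can't type it out without Stack Overflow thinking I am trying to actually embed HTML in this question).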

I suggest using Scrapy's `Selector` together with the requests module:

    import random

    import requests
    from scrapy.selector import Selector

    # Pool of User-Agent strings; one is picked at random for the request.
    ajanlar = [
        'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.4; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
    ]
    url = "https://finance.yahoo.com/quote/SPY/options"

    headers = {"User-Agent": random.choice(ajanlar)}
    response = requests.get(url, headers=headers)

    # Parse the page once and reuse the selector for every XPath query.
    selector = Selector(text=response.text)

    # normalize-space() returns the text of the first matching div
    # with runs of whitespace collapsed.
    xpath1 = "normalize-space(//div[@class='Mt(6px) smartphone_Mt(15px)'])"
    xpath2 = "normalize-space(//div[@class='D(ib) Maw(65%) Maw(70%)--tab768 Ov(h)'])"
    xpath3 = "normalize-space(//div[@class='D(ib) Mend(20px)'])"

    var1 = selector.xpath(xpath1).extract()[0]
    var2 = selector.xpath(xpath2).extract()[0]
    var3 = selector.xpath(xpath3).extract()[0]

    print(var1)
    print(var2)
    print(var3)

Output:

269.97-1.43 (-0.53%)At close: 4:00PM EST269.61 -0.44 (-0.16%)After hours: 6:08PM ESTPeople also watchDIAIWMQQQXLFGLD
269.97-1.43 (-0.53%)At close: 4:00PM EST269.61 -0.44 (-0.16%)After hours: 6:08PM EST
269.97-1.43 (-0.53%)At close: 4:00PM EST

After that, you can apply a regular expression to the extracted text to pull out the price.
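As a rough illustration of that regex step, here is a minimal sketch that parses the third output line shown above. The pattern, the `QUOTE_RE` name, and the `parse_quote` helper are assumptions made for this example, not part of the original answer. It also assumes the change and percentage always carry an explicit sign, as they do in the sample output.

```python
import re

# Sample string copied from the output above.
quote_text = "269.97-1.43 (-0.53%)At close: 4:00PM EST"

# Hypothetical pattern: price, signed change, signed percent change in parentheses.
QUOTE_RE = re.compile(
    r"(?P<price>\d[\d,]*\.\d+)"            # e.g. 269.97
    r"\s*(?P<change>[+-]\d[\d,]*\.\d+)"    # e.g. -1.43
    r"\s*\((?P<pct>[+-]\d+\.\d+)%\)"       # e.g. (-0.53%)
)


def parse_quote(text):
    """Return the first price/change/percent triple found in text, or None."""
    match = QUOTE_RE.search(text)
    if match is None:
        return None
    return {
        "price": float(match.group("price").replace(",", "")),
        "change": float(match.group("change").replace(",", "")),
        "pct": float(match.group("pct")),
    }


print(parse_quote(quote_text))
# {'price': 269.97, 'change': -1.43, 'pct': -0.53}
```

Because `search` stops at the first match, passing the longer strings (`var1` or `var2`) would return the regular-session quote rather than the after-hours one.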

If you only want the price:

    import urllib.request
    from bs4 import BeautifulSoup, Comment

    page = urllib.request.urlopen("https://finance.yahoo.com/quote/SPY?p=SPY")
    content = page.read().decode('utf-8')
    soup = BeautifulSoup(content, 'html.parser')

    # Strip the HTML comments (the "react-text" markers) from the tree.
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    for comment in comments:
        comment.extract()

    # The price lives in the span with data-reactid "14".
    price = soup.find("span", {"data-reactid": "14", "class" : "Trsdu(0.3s) "}).text
    print(price)

Output:

271.40

