
Best way to scrape data in Python from this website?

I am trying to scrape the price of the coin from the link below using the following Python script, but I am running into some issues. If you could identify where I am going wrong, that would be great!

URLs for various coins from various websites are stored in a Python dictionary called coins. A try/except/else stack is used to find the appropriate method to get the price. For some reason I am unable to extract the price from the following URL:

https://www.hattongardenmetals.com/buy/2020-gold-britannia-2

The script being used is:

    for coin in prices:
        response = requests.get(prices[coin]["url"])
        soup = BeautifulSoup(response.text, 'html.parser')

        try:
            text_price = soup.find(
                'td', {'id': 'price-inc-vat-per-unit-1'}).get_text()
        except:
            text_price = soup.find(
                'td', {'id': 'total-price-inc-vat-1'}).get_text()
        else:
            text_price = soup.find(
                'span', {'class': 'woocommerce-Price-amount amount'}).get_text       #<------

For some reason the line highlighted by the arrow keeps raising this error:

AttributeError: 'NoneType' object has no attribute 'get_text'

How can I fix this, and how does your fix integrate with the code provided?

The price is inside a <span> tag with class="h2 d-block":

    import requests
    from bs4 import BeautifulSoup


    url = 'https://www.hattongardenmetals.com/buy/2020-gold-britannia-2'
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')

    print(soup.select_one('span.h2').text)

Prints:

£1584.31
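
To fit this into the loop from the question, one option is to try each known selector in turn and keep the first one that matches, instead of chaining try/except/else. The following is only a minimal sketch under assumptions: the helper name `first_price_text`, the `SELECTORS` list, and the inline sample HTML are hypothetical and made up for illustration. Only the ids and class names come from the question and the answer above, and the real pages may be laid out differently.

```python
from bs4 import BeautifulSoup

# Selectors taken from the question and the answer above, tried in order.
SELECTORS = [
    "td#price-inc-vat-per-unit-1",
    "td#total-price-inc-vat-1",
    "span.woocommerce-Price-amount.amount",
    "span.h2",
]


def first_price_text(soup, selectors=SELECTORS):
    """Return the stripped text of the first selector that matches, else None."""
    for selector in selectors:
        tag = soup.select_one(selector)
        if tag is not None:  # select_one returns None when nothing matches
            return tag.get_text(strip=True)
    return None


# Hypothetical sample markup standing in for a downloaded page.
sample_html = '<div><span class="h2 d-block">£1584.31</span></div>'
soup = BeautifulSoup(sample_html, "html.parser")
print(first_price_text(soup))
```

Inside the original loop, `soup` would be built from `requests.get(prices[coin]["url"])` as before, and a `None` result would signal that none of the known layouts matched that page.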

Try adding parentheses after get_text so that the method is actually called:

    text_price = soup.find('span', {'class': 'woocommerce-Price-amount amount'}).get_text()
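
Adding the parentheses alone is unlikely to remove the AttributeError, because `find` returns None when nothing matches and any attribute access on None fails. It may also help to see how try/except/else flows: the else branch runs only when the try block raised nothing, so it cannot serve as a third fallback. The sketch below mimics that flow with plain callables. The `lookup` function and its arguments are hypothetical stand-ins for illustration, not part of the original script.

```python
def lookup(primary, fallback, extra):
    """Mimic the question's try/except/else flow with plain callables."""
    steps = []
    try:
        steps.append("try")
        value = primary()
    except AttributeError:
        steps.append("except")
        value = fallback()
    else:
        # Runs only when the try block raised nothing, replacing its result.
        steps.append("else")
        value = extra()
    return value, steps


def fails():
    # Attribute access on None raises AttributeError, with or without a call.
    return None.get_text


print(lookup(lambda: "A", lambda: "B", lambda: "C"))  # ('C', ['try', 'else'])
print(lookup(fails, lambda: "B", lambda: "C"))        # ('B', ['try', 'except'])
```

When the first lookup succeeds, its result is overwritten by the else branch; when it fails, only the except branch runs and the else branch is skipped.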


 