What is the best way to scrape data from this website in Python?

Question

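I am trying to get coin prices from the link below using the following Python script, but I am running into some problems. It would be great if you could spot where I am going wrong!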

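The URLs for various coins from various websites are stored in a Python dictionary called coins. A try/except/else stack is used to find the appropriate method for getting the price. For some reason, I cannot extract the price from the following URL: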

https://www.hattongardenmetals.com/buy/2020-gold-britannia-2

The script being used is:

    for coin in prices:
        response = requests.get(prices[coin]["url"])
        soup = BeautifulSoup(response.text, 'html.parser')

        try:
            text_price = soup.find(
                'td', {'id': 'price-inc-vat-per-unit-1'}).get_text()
        except:
            text_price = soup.find(
                'td', {'id': 'total-price-inc-vat-1'}).get_text()
        else:
            text_price = soup.find(
                'span', {'class': 'woocommerce-Price-amount amount'}).get_text       #<------

For some reason, the line marked with the arrow keeps raising this error:

AttributeError: 'NoneType' object has no attribute 'get_text'

How can I solve this, and how do I integrate the fix into the code provided?

Answer 1

The price is inside a <span> tag with class="h2 d-block":

import requests
from bs4 import BeautifulSoup


url = 'https://www.hattongardenmetals.com/buy/2020-gold-britannia-2'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

print( soup.select_one('span.h2').text )

Prints:

£1584.31
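
If the price is needed as a number rather than as text, the printed string could be converted along these lines. This is only a minimal sketch using the standard library: `parse_price` is a hypothetical helper name, and it assumes a dot as the decimal separator with optional comma thousands separators, as in the output above.

```python
import re
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a price string such as '£1584.31' into a Decimal.

    Hypothetical helper: assumes '.' is the decimal separator and ','
    (if present) is a thousands separator.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric price found in {text!r}")
    # Drop thousands separators before handing the digits to Decimal.
    return Decimal(match.group().replace(",", ""))


print(parse_price("£1584.31"))  # 1584.31
```

Decimal is used here instead of float so that currency amounts are not subject to binary rounding.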

Answer 2

Try adding parentheses after get_text:

text_price = soup.find('span', {'class': 'woocommerce-Price-amount amount'}).get_text()
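
Note that the error message mentions NoneType: soup.find returns None when no element matches, so the lookup can still fail on a page that lacks the element, regardless of the parentheses. One possible way to guard against that is sketched below. `first_text` and `PRICE_SELECTORS` are hypothetical names, the selectors are CSS equivalents of the three lookups in the question, and `soup` is assumed to be the BeautifulSoup object built inside the loop (any object with a `select_one` method would work).

```python
from typing import Iterable, Optional


def first_text(soup, selectors: Iterable[str]) -> Optional[str]:
    """Return the stripped text of the first selector that matches, else None.

    Hypothetical helper: `soup` is assumed to be a BeautifulSoup document.
    """
    for selector in selectors:
        element = soup.select_one(selector)  # None when nothing matches
        if element is not None:
            return element.get_text(strip=True)
    return None


# CSS equivalents of the three find() calls in the question's script.
PRICE_SELECTORS = [
    "td#price-inc-vat-per-unit-1",
    "td#total-price-inc-vat-1",
    "span.woocommerce-Price-amount.amount",
]
```

Inside the question's loop, the try/except/else chain could then become `text_price = first_text(soup, PRICE_SELECTORS)`, followed by a check for None before the value is used.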

2 solutions

Solution 1
2 2020-07-18 17:32:26

Solution 2
1 2020-07-18 17:36:35