Python web scraping with Beautiful Soup

Question

I'm trying to pull the entire table from the site below and store it as a DataFrame, but I'm hitting an error when attempting to pull all the headings. The table appears to have these attributes, so I'm not sure why this is happening.

import requests
from bs4 import BeautifulSoup

URL = "http://www.ercot.com/content/cdr/html/real_time_spp"
page = requests.get(URL).text
soup = BeautifulSoup(page, "lxml")

table = soup.find("table", attrs={"class": "tableStyle"})
table_data = table.tbody.find_all("tr")


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-241-362ee5fb0444> in <module>
      1 table = soup.find("table", attrs={"class": "tableStyle"})
----> 2 table_data = table.tbody.find_all("tr")

AttributeError: 'NoneType' object has no attribute 'find_all'

Answer 1

The HTML for that page doesn't have a tbody element, which is why table.tbody is None.
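
To see the behaviour in isolation, here is a minimal sketch. The markup is a hand-written stand-in, not the real page, and its column names and values are made up for illustration. It uses the built-in "html.parser", which (like "lxml") does not add a tbody element that is missing from the source.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: a table with no <tbody>.
html_no_tbody = """
<table class="tableStyle">
  <tr><th>Interval</th><th>Price</th></tr>
  <tr><td>0015</td><td>21.50</td></tr>
</table>
"""

soup = BeautifulSoup(html_no_tbody, "html.parser")
table = soup.find("table", attrs={"class": "tableStyle"})

# Attribute access returns None when the tag is absent, so calling
# .find_all on it raises the AttributeError from the question.
print(table.tbody)

# Searching from the table itself does not depend on <tbody>.
rows = table.find_all("tr")
print(len(rows))
```

Note that a browser's inspector usually shows a tbody even when the raw HTML lacks one, which may be why the element looked present.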

You can get all the rows directly from the table using:

table = soup.find("table", attrs={"class": "tableStyle"})
# find_all searches every descendant of the table, so no tbody is needed
table_data = table.find_all("tr")
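
Since the goal is a DataFrame, the rows still need to be turned into lists of cell text. The following is one possible sketch, not the only way to do it. The helper name table_to_rows and the sample markup are hypothetical, and it assumes the first row holds the headings.

```python
from bs4 import BeautifulSoup


def table_to_rows(html, css_class="tableStyle"):
    """Return the table's rows as lists of cell text (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": css_class})
    if table is None:
        raise ValueError(f"no table with class {css_class!r} found")

    rows = []
    # find_all("tr") works whether or not the rows sit inside a <tbody>.
    for tr in table.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)
    return rows


# Made-up sample data; the real page's columns will differ.
sample_html = """
<table class="tableStyle">
  <tr><th>Interval</th><th>Price</th></tr>
  <tr><td>0015</td><td>21.50</td></tr>
  <tr><td>0030</td><td>19.75</td></tr>
</table>
"""

rows = table_to_rows(sample_html)
header, body = rows[0], rows[1:]
print(header)
print(body)
```

With pandas installed, pandas.DataFrame(body, columns=header) would then build the DataFrame. All values arrive as strings, so numeric columns still need converting.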

Answer 1
0 2021-10-20 15:16:00

Python 网页抓取与美丽的汤

问题描述

1 个解决方案

解决方案1 0 2021-10-20 15:16:00

解决方案1
0 2021-10-20 15:16:00