简体   繁体   English

熊猫 read_html - 没有找到表格

[英]pandas read_html - no tables found

I am attempting to see if I can read a table of data from WU.com, but I am getting a type error for no tables found.我正在尝试查看是否可以从 WU.com 读取数据表,但由于找不到表而出现类型错误。 (first timer on web scraping too here) There is also another person with a very similar stackoverflow question here with WU table of data, but the solution is a little bit complicated to me. (这里也是网络抓取的第一个计时器)还有另一个人在这里有一个非常相似的 stackoverflow 问题其中包含 WU 数据表,但解决方案对我来说有点复杂。

import pandas as pd

df_list = pd.read_html('https://www.wunderground.com/history/daily/us/wi/milwaukee/KMKE/date/2013-6-26')

print(df_list)

On the webpage of historical data for Milwaukee , this is the table of data ( daily observations ) that I am attempting to retrieve into Pandas: 在 Milwaukee 的历史数据网页上,这是我试图检索到 Pandas 的数据表( daily observations ): 在此处输入图片说明

Any tips help, thank you.任何提示都有帮助,谢谢。

the page is dynamic which means you'll need to to render the page first.页面是动态的,这意味着您需要先呈现页面。 So you would need to use something like Selenium to render the page, then you can pull the table using pandas .read_html() :因此,您需要使用 Selenium 之类的东西来呈现页面,然后您可以使用 pandas .read_html()拉出表格:

from selenium import webdriver
import pandas as pd


driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
driver.get("https://www.wunderground.com/history/daily/us/wi/milwaukee/KMKE/date/2013-6-26")

html = driver.page_source

tables = pd.read_html(html)
data = tables[1]

driver.close()

Output:输出:

print (data)
        Time Temperature      ...       Precip Accum      Condition
0    6:52 PM        68 F      ...             0.0 in  Mostly Cloudy
1    7:52 PM        69 F      ...             0.0 in  Mostly Cloudy
2    8:52 PM        70 F      ...             0.0 in  Mostly Cloudy
3    9:52 PM        67 F      ...             0.0 in         Cloudy
4   10:52 PM        65 F      ...             0.0 in  Partly Cloudy
5   11:42 PM        66 F      ...             0.0 in  Mostly Cloudy
6   11:52 PM        68 F      ...             0.0 in  Mostly Cloudy
7   12:08 AM        68 F      ...             0.0 in         Cloudy
8   12:52 AM        68 F      ...             0.0 in  Mostly Cloudy
9    1:52 AM        70 F      ...             0.0 in         Cloudy
10   2:13 AM        70 F      ...             0.0 in         Cloudy
11   2:52 AM        71 F      ...             0.0 in         Cloudy
12   3:52 AM        70 F      ...             0.0 in  Mostly Cloudy
13   4:19 AM        70 F      ...             0.0 in         Cloudy
14   4:29 AM        70 F      ...             0.0 in         Cloudy
15   4:52 AM        70 F      ...             0.0 in         Cloudy
16   5:25 AM        70 F      ...             0.0 in  Mostly Cloudy
17   5:52 AM        71 F      ...             0.0 in         Cloudy
18   6:52 AM        73 F      ...             0.0 in         Cloudy
19   7:52 AM        74 F      ...             0.0 in         Cloudy
20   8:52 AM        73 F      ...             0.0 in         Cloudy
21   9:52 AM        71 F      ...             0.0 in         Cloudy
22  10:52 AM        71 F      ...             0.0 in         Cloudy
23  11:52 AM        70 F      ...             0.0 in         Cloudy
24  12:52 PM        72 F      ...             0.0 in  Mostly Cloudy
25   1:52 PM        70 F      ...             0.0 in  Mostly Cloudy
26   2:52 PM        71 F      ...             0.0 in  Mostly Cloudy
27   3:52 PM        71 F      ...             0.0 in  Partly Cloudy
28   4:52 PM        68 F      ...             0.0 in  Mostly Cloudy
29   5:52 PM        66 F      ...             0.0 in  Mostly Cloudy

[30 rows x 11 columns]

如果你想访问一个不存在的文件,还要检查你的文件名是否正确,你会得到同样的错误“没有找到表”我用 X.htm 犯了我的错误,正在查看 X.html

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM