pandas read_excel

Question

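I have been struggling for hours to read an Excel file with pd.read_excel when the path is a website address. I found that the link does not point directly to the file; it only triggers a download. Is there a simple way to work around this?

Relevant code: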

import pandas as pd

link_energy = 'http://unstats.un.org/unsd/environment/excel_file_tables/2013/Energy%20Indicators.xls'
df_energy = pd.read_excel(link_energy)

Error message:

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\n\n\n<!DOC'

This may not be a pandas problem at all, but I lack the skills to work out what to do.

Answer 1

For me, the following code works as expected:

import pandas as pd
link_energy = 'http://unstats.un.org/unsd/environment/excel_file_tables/2013/Energy%20Indicators.xls'
df_energy = pd.read_excel(link_energy)
df_energy

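It runs without errors in the following environment:

The version of the notebook server is: 5.2.2
The server is running on this version of Python:

Python 3.6.3 | packaged by conda-forge | (default, Nov 4 2017, 10:10:56) [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]

Current kernel information:

Python 3.6.3 | packaged by conda-forge | (default, Nov 4 2017, 10:10:56)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.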
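If the same code fails on your machine, note that the error in the question shows the parser received bytes starting with `<!DOC`, which is an HTML page rather than a workbook. One way to narrow this down is to download the bytes yourself and check what they are before handing them to pandas. The following is only a sketch: the helper names `sniff_payload` and `read_excel_from_url` are made up for illustration, and the browser-like User-Agent header is an assumption about what this particular server might want, not something confirmed in this thread.

```python
import io
import urllib.request

# Magic numbers: legacy .xls files are OLE2 compound documents,
# .xlsx files are ZIP archives.
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"
XLSX_MAGIC = b"PK\x03\x04"


def sniff_payload(payload: bytes) -> str:
    """Classify downloaded bytes as 'xls', 'xlsx', 'html' or 'unknown'."""
    if payload.startswith(XLS_MAGIC):
        return "xls"
    if payload.startswith(XLSX_MAGIC):
        return "xlsx"
    # Ignore leading whitespace and case, e.g. b'\n\n\n<!DOC...'.
    head = payload[:512].lstrip().lower()
    if head.startswith(b"<!doc") or head.startswith(b"<html"):
        return "html"
    return "unknown"


def read_excel_from_url(url: str, timeout: float = 30.0):
    """Download url, refuse non-Excel payloads, then parse with pandas."""
    import pandas as pd  # imported lazily so sniff_payload works without pandas

    # Assumption: some servers respond differently without a browser-like User-Agent.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = response.read()

    kind = sniff_payload(payload)
    if kind not in ("xls", "xlsx"):
        # Show the first bytes so it is obvious what the server actually sent.
        raise ValueError(f"expected an Excel file, got {kind!r}: {payload[:40]!r}")
    return pd.read_excel(io.BytesIO(payload))
```

Calling `read_excel_from_url(link_energy)` would then either return a DataFrame or raise a ValueError that shows the first bytes of the response. If it reports 'html', the URL is serving a web page (a redirect, an error page, or a download landing page), and the direct file link has to be found some other way. Reading a legacy .xls file also requires the xlrd package to be installed.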

Answer 2

I could not access the URL you posted.

That said, pd.read_excel does not work here; you need to use pd.read_csv instead:

import pandas as pd

df = pd.read_csv('https://cib.societegenerale.com/fileadmin/indices_feeds/CTA_Historical.xls')

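Next, look inside the file to see which delimiter it uses. If there are extra values that do not belong to the table, skip them so that only the useful data is loaded and read.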
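To check the delimiter without opening the file by hand, something like the sketch below could help. `guess_delimiter` is a hypothetical helper, not part of pandas, and the candidate list is an assumption; it simply picks the character that appears on every inspected line.

```python
def guess_delimiter(text, candidates=(",", ";", "\t", "|"), max_lines=20):
    """Return the candidate delimiter that appears on every non-empty line."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:max_lines]
    if not lines:
        raise ValueError("no non-empty lines to inspect")
    # A real delimiter shows up on every row, so score each candidate
    # by its minimum count per line rather than by its total count.
    scores = {d: min(ln.count(d) for ln in lines) for d in candidates}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        raise ValueError("no candidate delimiter appears on every line")
    return best
```

The result can be passed to pd.read_csv through its `sep` parameter, and `skiprows` can drop any leading lines that are not part of the table. If the file starts with such lines, pass only the text after them to `guess_delimiter`, otherwise no delimiter will be found on every line.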


Solution 1
1 2018-02-14 20:00:52

Solution 2
0 2018-02-14 18:56:38
