How to read a CSV file from GitHub using Pandas

Question

I'm trying to read a CSV file that's on GitHub with Python using pandas. I have looked all over the web and tried some solutions that I found on this website, but they do not work. What am I doing wrong?

I have tried this:

import pandas as pd

url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv'
df = pd.read_csv(url,index_col=0)
#df = pd.read_csv(url)

print(df.head(5))

Answer 1

You should provide the URL to the raw content. The github.com/.../blob/... address returns an HTML page, not the CSV itself. Try using this:

import pandas as pd

url = 'https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv'
df = pd.read_csv(url, index_col=0)
print(df.head(5))

Output:

               alpha-2           ...            intermediate-region-code
name                             ...                                    
Afghanistan         AF           ...                                 NaN
Åland Islands       AX           ...                                 NaN
Albania             AL           ...                                 NaN
Algeria             DZ           ...                                 NaN
American Samoa      AS           ...                                 NaN

Answer 2

Add ?raw=true at the end of the GitHub URL to get the raw file link.

In your case,

import pandas as pd
url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv?raw=true'
df = pd.read_csv(url,index_col=0)
#df = pd.read_csv(url)

print(df.head(5))

Note: This works only with GitHub links and not with GitLab or Bitbucket links.
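If you build the URL in code, the ?raw=true suffix can be added with the standard library instead of string concatenation. This is only a minimal sketch; with_raw_flag is a hypothetical helper name, not something pandas or GitHub provides:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_flag(url: str) -> str:
    """Return the URL with raw=true set in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep any query parameters that are already present.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["raw"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


page_url = "https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv"
print(with_raw_flag(page_url))
```

The result can then be passed to pd.read_csv exactly as in the snippet above.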

Answer 3

I recommend either using pandas, as you tried and as others here have explained, or, depending on the application, the Python CSV handler CommaSeperatedPython, which is a minimalistic wrapper for the native csv library.

The library returns the contents of a file as a 2-dimensional string array. It is still at a very early stage though, so if you want to do large-scale data analysis, I would suggest Pandas.
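I don't show the CommaSeperatedPython API here, so the sketch below uses the standard csv module instead to produce the same kind of result, a list of rows where every cell is a string. The function name csv_text_to_rows and the inline sample text are assumptions for illustration; the sample only reuses a few values from the output in Answer 1 and is not the real file:

```python
import csv
import io
from typing import List


def csv_text_to_rows(text: str) -> List[List[str]]:
    """Parse CSV text into a 2-dimensional list of strings (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text)))


# Small inline sample, not downloaded from GitHub.
sample = "name,alpha-2\nAfghanistan,AF\nAlbania,AL\n"
rows = csv_text_to_rows(sample)
print(rows[0])   # header row
print(rows[1:])  # data rows, all values kept as strings
```

To use it on the GitHub file you would first download the raw URL as text, for example with urllib.request.urlopen, and pass the decoded text to the function.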

Answer 4

You can copy/paste the URL and change 2 things:

Remove the "/blob" path segment
Replace github.com with raw.githubusercontent.com

For instance, this link:

https://github.com/mwaskom/seaborn-data/blob/master/iris.csv

Works this way:

import pandas as pd

pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
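The two changes can also be done in code if you have many links to convert. A rough sketch using only the standard library; github_blob_to_raw is a hypothetical helper name, and it assumes the usual /<owner>/<repo>/blob/<branch>/<path> layout of GitHub page URLs:

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com ".../blob/..." page URL to a raw.githubusercontent.com URL.

    Hypothetical helper, not part of pandas or any GitHub library.
    URLs that do not match the expected shape are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        return url
    # Assumed path shape: /<owner>/<repo>/blob/<branch>/<path to file>
    segments = parts.path.split("/")
    if len(segments) <= 3 or segments[3] != "blob":
        return url
    del segments[3]  # change 1: drop the "blob" segment
    # change 2: swap the host for raw.githubusercontent.com
    return urlunsplit(("https", "raw.githubusercontent.com", "/".join(segments), "", ""))


page_url = "https://github.com/mwaskom/seaborn-data/blob/master/iris.csv"
raw_url = github_blob_to_raw(page_url)
print(raw_url)
```

raw_url can then be passed straight to pd.read_csv.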

Answer 5

First convert the GitHub CSV file link to its raw form in order to access the data; follow the link in the comment below on how to convert a CSV file link to raw.

import pandas as pd

url_data = (r'https://raw.githubusercontent.com/oderofrancis/rona/main/Countries-Continents.csv')

data_csv = pd.read_csv(url_data)

data_csv.head()

5 answers

Answer 1: score 25, accepted, 2019-03-19 11:50:14
Answer 2: score 6, 2020-07-26 15:55:55
Answer 3: score 0, 2019-12-18 20:28:23
Answer 4: score 0, 2020-09-29 11:50:48
Answer 5: score 0, 2021-04-26 18:41:13
