Pandas read_csv not splitting columns according to the separator
I have data from the Johns Hopkins GitHub repository: https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
I want to import the data using the following command:
data = pd.read_csv('JohnHopkins.csv', sep=',')
However, I am unable to separate the columns.
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,2/22/20,2/23/20,2/24/20,2/25/20,2/26/20,2/27/20,2/28/20,2/29/20,3/1/20,3/2/20,3/3/20,3/4/20,3/5/20,3/6/20,3/7/20,3/8/20,3/9/20,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20,3/20/20,3/21/20,3/22/20,3/23/20,3/24/20,3/25/20,3/26/20,3/27/20,3/28/20,3/29/20,3/30/20,3/31/20,4/1/20,4/2/20,4/3/20,4/4/20,4/5/20,4/6/20,4/7/20,4/8/20,4/9/20,4/10/20,4/11/20,4/12/20,4/13/20,4/14/20,4/15/20,4/16/20,4/17/20,4/18/20,4/19/20,4/20/20,4/21/20,4/22/20,4/23/20,4/24/20,4/25/20,4/26/20,4/27/20,4/28/20,4/29/20,4/30/20,5/1/20,5/2/20,5/3/20,5/4/20,5/5/20,5/6/20,5/7/20,5/8/20,5/9/20,5/10/20,5/11/20,5/12/20,5/13/20,5/14/20,5/15/20,5/16/20,5/17/20,5/18/20,5/19/20,5/20/20,5/21/20,5/22/20,5/23/20,5/24/20,5/25/20,5/26/20,5/27/20,5/28/20,5/29/20,5/30/20,5/31/20,6/1/20,6/2/20,6/3/20,6/4/20,6/5/20,6/6/20,6/7/20,6/8/20,6/9/20,6/10/20,6/11/20,6/12/20,6/13/20,6/14/20,6/15/20,6/16/20,6/17/20,6/18/20,6/19/20,6/20/20,6/21/20,6/22/20,6/23/20,6/24/20,6/25/20,6/26/20,6/27/20,6/28/20,6/29/20,6/30/20,7/1/20,7/2/20,7/3/20,7/4/20,7/5/20,7/6/20,7/7/20,7/8/20,7/9/20,7/10/20,7/11/20,7/12/20,7/13/20,7/14/20,7/15/20,7/16/20,7/17/20,7/18/20,7/19/20,7/20/20,7/21/20,7/22/20,7/23/20,7/24/20,7/25/20,7/26/20,7/27/20,7/28/20,7/29/20,7/30/20,7/31/20,8/1/20,8/2/20,8/3/20,8/4/20,8/5/20,8/6/20,8/7/20,8/8/20,8/9/20,8/10/20,8/11/20,8/12/20,8/13/20,8/14/20,8/15/20,8/16/20,8/17/20,8/18/20,8/19/20,8/20/20,8/21/20,8/22/20,8/23/20,8/24/20,8/25/20,8/26/20,8/27/20,8/28/20,8/29/20,8/30/20,8/31/20,9/1/20,9/2/20,9/3/20,9/4/20,9/5/20,9/6/20,9/7/20,9/8/20,9/9/20,9/10/20,9/11/20,9/12/20,9/13/20
0 ,Afghanistan,33.93911,67.709953,0,0,0,0,0,0,0,...
1 ,Albania,41.1533,20.1683,0,0,0,0,0,0,0,0,0,0,0...
2 ,Algeria,28.0339,1.6596,0,0,0,0,0,0,0,0,0,0,0,...
3 ,Andorra,42.5063,1.5218,0,0,0,0,0,0,0,0,0,0,0,...
4 ,Angola,-11.2027,17.8739,0,0,0,0,0,0,0,0,0,0,0...
Do you know a workaround?
I cannot reproduce the issue. Here I am importing the raw CSV file directly from the GitHub link the OP provided.
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv", sep=",")
## df.head()
## Out[61]:
## Province/State Country/Region Lat Long 1/22/20 1/23/20 1/24/20 ... 9/7/20 9/8/20 9/9/20 9/10/20 9/11/20 9/12/20 9/13/20
## 0 NaN Afghanistan 33.93911 67.709953 0 0 0 ... 38494 38520 38544 38572 38606 38641 38716
## 1 NaN Albania 41.15330 20.168300 0 0 0 ... 10406 10553 10704 10860 11021 11185 11353
## 2 NaN Algeria 28.03390 1.659600 0 0 0 ... 46653 46938 47216 47488 47752 48007 48254
## 3 NaN Andorra 42.50630 1.521800 0 0 0 ... 1261 1261 1301 1301 1344 1344 1344
## 4 NaN Angola -11.20270 17.873900 0 0 0 ... 2981 3033 3092 3217 3279 3335 3388
## [5 rows x 240 columns]
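Since the raw file parses correctly, the problem may lie in the local copy 'JohnHopkins.csv' rather than in read_csv itself. One possible cause, which is only an assumption here and is not confirmed by the OP, is that the file was re-saved by another program so that each whole line is wrapped in double quotes; pandas then treats every line as a single field. The sketch below reproduces that situation with made-up sample text (not the OP's actual file) and shows one way to strip the wrapping quotes before parsing.

```python
import io

import pandas as pd

# Hypothetical sample: every line is wrapped in double quotes.
# This is an assumed illustration, not the content of the OP's file.
raw = (
    '"Province/State,Country/Region,Lat,Long,1/22/20"\n'
    '",Afghanistan,33.93911,67.709953,0"\n'
    '",Albania,41.1533,20.1683,0"\n'
)

# With default quoting, each quoted line is one field, so the
# result has a single column even though sep="," is given.
broken = pd.read_csv(io.StringIO(raw), sep=",")

# Workaround sketch: remove the wrapping quotes, then parse as usual.
cleaned = "\n".join(line.strip().strip('"') for line in raw.splitlines())
fixed = pd.read_csv(io.StringIO(cleaned), sep=",")

print(broken.shape)
print(fixed.shape)
print(fixed.columns.tolist())
```

To check whether this applies to a local file, printing repr() of its first line (for example with open('JohnHopkins.csv') and readline()) would show any stray quotes or other unexpected characters.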