Reading a pandas DataFrame whose rows have unequal numbers of columns
I am trying to read this small data file: https://drive.google.com/open?id=1nAS5mpxQLVQn9s_aAKvJt8tWPrP_DUiJ
I am using the following code:
df = pd.read_table('/Data/123451_date.csv', sep=';', index_col=0, engine='python', error_bad_lines=False)
The file uses ';' as a separator, and some observations (rows) are missing values for some columns.
How can I read it properly? The DataFrame I currently get is not loaded correctly.
It looks like the data you are using has some garbage in it. Specifically, rows 1-33 (inclusive) contain additional, unnecessary (non-GPS) information. You can either fix the file by manually removing the unneeded rows, or use the following code snippet to skip them:
from pandas import read_table
# Skip rows 1-33 (non-GPS information), then drop the empty column produced by the trailing ';'
data = read_table('34_2017-02-06.gpx.csv', sep=';', skiprows=list(range(1, 34))).drop("Unnamed: 28", axis=1)
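As a small illustration of how skiprows behaves, here is a sketch on a made-up in-memory sample rather than the real file. The column names, the two "note" rows, and the values are assumptions for illustration only. It uses pd.read_csv with an explicit sep, which should behave the same as read_table for this purpose.

```python
import io

import pandas as pd

# Hypothetical sample: a header line, two non-GPS rows, then two GPS rows.
sample = (
    "index;lat;lon\n"
    "note;no GPS fix\n"
    "note;sensor warm-up\n"
    "49;52.077362;5.114530\n"
    "51;52.077468;5.114543\n"
)

# skiprows takes 0-based line numbers; line 0 (the header) is kept,
# lines 1 and 2 (the non-GPS rows) are skipped.
data = pd.read_csv(io.StringIO(sample), sep=";", skiprows=list(range(1, 3)))

print(data)
```

Because the header sits on line 0 and is not in the skip list, the column names survive and only the unwanted rows are dropped.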
The drop("Unnamed: 28", axis=1) call is simply there to remove an additional column that is probably created because each row in your file ends with a ';' (pandas reads the empty field after the final separator on each line as data).
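To see where such a column comes from, the following sketch parses a tiny made-up sample in which every line, including the header, ends with ';'. The sample data is an assumption for illustration, so the extra column is named "Unnamed: 3" here rather than "Unnamed: 28".

```python
import io

import pandas as pd

# Hypothetical sample: every line ends with a trailing ';'
sample = (
    "index;lat;lon;\n"
    "49;52.077362;5.114530;\n"
    "51;52.077468;5.114543;\n"
)

raw = pd.read_csv(io.StringIO(sample), sep=";")

# The empty header field after the last ';' becomes an auto-named column.
print(list(raw.columns))  # ['index', 'lat', 'lon', 'Unnamed: 3']

# That column holds only missing values, so dropping it loses no data.
print(raw["Unnamed: 3"].isna().all())

cleaned = raw.drop("Unnamed: 3", axis=1)
print(list(cleaned.columns))
```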
The result of print(data.head()) is then as follows:
index cumdist ele ... esttotalpower lat lon
0 49 340 -34.8 ... 9 52.077362 5.114530
1 51 350 -34.8 ... 17 52.077468 5.114543
2 52 360 -35.0 ... -54 52.077521 5.114551
3 53 370 -35.0 ... -173 52.077603 5.114505
4 54 380 -34.8 ... 335 52.077677 5.114387
[5 rows x 28 columns]
To explain the role of the drop call further, here is what would happen without it (notice the odd last column):
index cumdist ele ... lat lon Unnamed: 28
0 49 340 -34.8 ... 52.077362 5.114530 NaN
1 51 350 -34.8 ... 52.077468 5.114543 NaN
2 52 360 -35.0 ... 52.077521 5.114551 NaN
3 53 370 -35.0 ... 52.077603 5.114505 NaN
4 54 380 -34.8 ... 52.077677 5.114387 NaN
[5 rows x 29 columns]
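If you would rather not hard-code the name "Unnamed: 28", one possible variation is to filter out every auto-named column after reading. This is only a sketch on a made-up sample (the data and column names are assumptions), not something taken from the original file.

```python
import io

import pandas as pd

# Hypothetical sample with a trailing ';' on every line.
sample = (
    "index;lat;lon;\n"
    "49;52.077362;5.114530;\n"
    "51;52.077468;5.114543;\n"
)

raw = pd.read_csv(io.StringIO(sample), sep=";")

# Boolean mask: True for columns whose header is not an auto-generated
# "Unnamed: N" name.
keep = ~raw.columns.str.startswith("Unnamed:")
data = raw.loc[:, keep]

print(data)
```

An alternative would be raw.dropna(axis=1, how="all"), but note that it would also remove any genuine column that happens to be entirely empty.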