pandas: read_csv with missing values in the header

Question

I have a CSV file like this:

date,a,b,c
2014,12,29,7,12,45
2014,12,30,7,13,12
2014,12,31,6.5,6,5

The header row does not name every column: it assumes the reader understands that the first three columns together make up the date.

How do I tell read_csv to treat the first three columns as a single date column, while keeping the other labels?

Answer 1

You can parse the columns directly as a date by using the parse_dates argument.

From the docs:

parse_dates : boolean, list of ints or names, list of lists, or dict, default False

If True -> try parsing the index.
If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
{'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'.
A fast-path exists for iso8601-formatted dates.

For your file, you can do something like this:
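To make the two simplest forms from the docs concrete, here is a minimal sketch. The inline CSV text and the column names day and value are made up for illustration and are not from the question's file.

```python
import io
import pandas as pd

# Hypothetical two-row CSV with one ISO 8601 date column.
csv_text = "day,value\n2014-12-29,7\n2014-12-30,8\n"

# List form: parse the named column as dates.
by_name = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])

# True form: parse the index, which index_col=0 sets to the first column.
by_index = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(by_name.dtypes)
print(by_index.index)
```

In both cases the dates end up as datetime values rather than plain strings; the only difference is whether they live in a regular column or in the index.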

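import pandas as pd

# names supplies all six column labels; header=0 discards the file's own header row
pd.read_csv(file_path, names=['y', 'm', 'd', 'a', 'b', 'c'], header=0,
    parse_dates={'date': [0, 1, 2]}, index_col='date')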

              a   b   c
date                   
2014-12-29  7.0  12  45
2014-12-30  7.0  13  12
2014-12-31  6.5   6   5

The missing header values are handled by passing the names argument together with header=0, which replaces the existing header row with the six supplied names. With every column named, you can then specify which columns should be parsed as a date.
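As far as I know, combining several columns through parse_dates is deprecated in recent pandas releases (2.2 onward), so an alternative may be useful there. The sketch below is one way to get the same result, not the method from the answer: it reads the three date parts as ordinary columns and assembles them with pd.to_datetime. The column names year, month and day are my choice (pd.to_datetime expects those names when given a DataFrame), and the file content is inlined so the example runs on its own.

```python
import io
import pandas as pd

# The sample file from the question, inlined for a self-contained example.
csv_text = """date,a,b,c
2014,12,29,7,12,45
2014,12,30,7,13,12
2014,12,31,6.5,6,5
"""

# Replace the incomplete header with six explicit names.
df = pd.read_csv(
    io.StringIO(csv_text),
    names=["year", "month", "day", "a", "b", "c"],
    header=0,
)

# Assemble one datetime column from the year/month/day columns.
df["date"] = pd.to_datetime(df[["year", "month", "day"]])

# Drop the parts and use the combined date as the index.
df = df.drop(columns=["year", "month", "day"]).set_index("date")
print(df)
```

This keeps the a, b and c labels from the original header and leaves the date as a DatetimeIndex, at the cost of one extra step after reading.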

See another example here.

Answer 1: score 2, accepted, 2015-12-23 22:45:19