Reading data from csv into pandas when date and time are in separate columns

Question

I looked at the answer to this question: Parse dates when YYYYMMDD and HH are in separate columns using pandas in Python, but it doesn't seem to work for me, which makes me think I'm doing something subtly wrong.

I've got data in .csv files, which I'm trying to read using the pandas read_csv function. Date and time are in two separate columns, but I want to merge them into one column, "Datetime", containing datetime objects. The csv looks like this:

    Note about the data
    blank line
    Site Id,Date,Time,WTEQ.I-1...
    2069, 2008-01-19, 06:00, -99.9...
    2069, 2008-01-19, 07:00, -99.9...
    ...

I'm trying to read it using this line of code:

    read_csv("2069_ALL_YEAR=2008.csv", skiprows=2, parse_dates={"Datetime" : [1,2]}, date_parser=True, na_values=["-99.9"])

However, when I write it back out to a csv, it looks exactly the same (except that the -99.9s are changed to NA, as I specified with the na_values argument). Date and time are still in two separate columns. As I understand it, this should create a new column called Datetime that is composed of columns 1 and 2, parsed using the date_parser. I have also tried using parse_dates={"Datetime" : ["Date","Time"]}, parse_dates=[[1,2]], and parse_dates=[["Date", "Time"]]. I have also tried using date_parser=parse, where parse is defined as:

    parse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M')

None of these has made the least bit of difference, which makes me suspect that there's some deeper problem. Any insight into what it might be?

Answer 1

You should update your pandas; I recommend the latest stable version for the latest features and bug fixes.

This specific feature was introduced in 0.8.0, and works on pandas version 0.11:

    In [11]: read_csv("2069_ALL_YEAR=2008.csv", skiprows=2, parse_dates={"Datetime" : [1,2]}, na_values=["-99.9"])
    Out[11]:
                 Datetime  Site Id  WTEQ.I-1
    0 2008-01-19 06:00:00     2069       NaN
    1 2008-01-19 07:00:00     2069       NaN

Note that this call omits date_parser=True: date_parser should be a parsing function (see the docstring), not a boolean.
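To illustrate what "a parsing function" means here, the following is a minimal sketch using only the standard library. It reuses the lambda from the question and applies it to a date and time joined by a space, which is an assumption about how the two columns would be combined before parsing. The sample string is taken from the csv excerpt in the question.

```python
from datetime import datetime

# The parser from the question: a callable that turns one combined
# "date time" string into a datetime object.
parse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M')

# Assumed combined form of the Date and Time fields from the sample csv.
result = parse("2008-01-19 06:00")
print(result)

# True is not callable, so it cannot serve as a parsing function.
print(callable(parse), callable(True))
```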

Note that in the provided example the resulting "Datetime" column is an ordinary column (a Series of its own), not the index of the DataFrame. If you would rather have the datetime values as the index instead of the default integer index, pass the index_col argument specifying the desired column, in this case 0, since the resulting "Datetime" column is the first one:

    In [11]: read_csv("2069_ALL_YEAR=2008.csv", skiprows=2, parse_dates={"Datetime" : [1,2]}, index_col=0, na_values=["-99.9"])
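For readers on much newer pandas versions: to my knowledge, recent releases have deprecated both the dict/nested form of parse_dates and the date_parser argument, so the calls above may warn or fail there. The sketch below is one alternative, not the method from this answer: it reads the two columns as plain strings and combines them afterwards with pd.to_datetime. The in-memory csv text is a stand-in for the real file, skipinitialspace=True is an assumption to handle the spaces after the commas in the sample, and only the one measurement column shown in the output above is included.

```python
import io

import pandas as pd

# Stand-in for "2069_ALL_YEAR=2008.csv": one note line, one blank line,
# then the header and data rows from the question (truncated columns omitted).
csv_text = (
    "Note about the data\n"
    "\n"
    "Site Id,Date,Time,WTEQ.I-1\n"
    "2069, 2008-01-19, 06:00, -99.9\n"
    "2069, 2008-01-19, 07:00, -99.9\n"
)

# skiprows=2 skips the note line and the blank line;
# skipinitialspace=True (an assumption) strips the space after each comma.
df = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=2,
    skipinitialspace=True,
    na_values=["-99.9"],
)

# Join the two string columns and parse them in one step.
df["Datetime"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%Y-%m-%d %H:%M"
)

# Drop the original columns and use the parsed values as the index.
df = df.drop(columns=["Date", "Time"]).set_index("Datetime")
print(df)
```

Doing the combination after the read keeps the parsing step explicit, at the cost of one extra line compared with the parse_dates shortcut.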

Answer 1: 3 · accepted · 2013-07-05 17:02:07