
How to slice unique dates data from a DataFrame?

Existing answers on slicing by date usually extract the data between two dates, not the data for specific, non-contiguous dates. My problem is selecting the rows for specific dates. My dataframe is:

df =                           A          B
     2019-03-21 19:15:00  21.787958  16.728439
     2019-03-25 19:16:00  20.983078  15.865983
     2019-03-29 19:17:00  20.122042  15.073062

I want to extract the rows for the 21st and the 29th. My code is given below. Code 1:

df.index == ['2019-03-21','2019-03-29']

Output:

ValueError: Lengths must match

Code 2:

df['2019-03-21','2019-03-29']

Output:

KeyError: ('2019-03-21', '2019-03-29')

Could you help me find the mistake here?

A few things are going on here. First, comparing an index to a list with "==" is an elementwise comparison, so both sides must have the same length; it does not test whether each value appears in the list. For that, use pandas' built-in 'isin' method.

Second, when you pass a boolean mask to a dataframe to filter it, the mask needs one element per row of the dataframe.

Third, your index holds datetimes, but you want to compare it with dates, so you have to extract the date component first.
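The three points above can be seen in a small sketch. It rebuilds the question's dataframe and uses `DatetimeIndex.normalize()` to drop the time of day, which is one possible way to get the date component and not necessarily the only one; the variable names `wanted`, `raw_mask` and `day_mask` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [21.787958, 20.983078, 20.122042],
     "B": [16.728439, 15.865983, 15.073062]},
    index=pd.to_datetime(["2019-03-21 19:15:00",
                          "2019-03-25 19:16:00",
                          "2019-03-29 19:17:00"]),
)

# Hypothetical name: the two days we want to keep, as midnight timestamps.
wanted = pd.to_datetime(["2019-03-21", "2019-03-29"])

# "==" compares element by element, so 3 index values against
# 2 list values cannot be lined up.
try:
    df.index == ["2019-03-21", "2019-03-29"]
except ValueError as exc:
    print("== failed:", exc)

# isin tests membership and always returns one boolean per row, but on
# the raw timestamps nothing matches: 19:15 on the 21st is not midnight.
raw_mask = df.index.isin(wanted)

# Dropping the time of day first makes the comparison date against date.
day_mask = df.index.normalize().isin(wanted)

print(df[day_mask])
```

Both masks have the same length as the dataframe, which is what boolean filtering requires; only the second one is true for the rows on the 21st and the 29th.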

import pandas as pd

df = pd.DataFrame({'A': [21.787958, 20.983078, 20.122042], 'B': [16.728439, 15.865983, 15.073062]})
df.index = pd.to_datetime(['2019-3-21 19:15:0', '2019-3-25 19:16:0', '2019-3-29 19:17:0'])

So here is the filtered dataframe:

df[pd.to_datetime(df.index.date).isin(pd.to_datetime(['2019-03-21','2019-03-29']))]
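As an alternative to building a mask, a `DatetimeIndex` also accepts date strings as slice bounds in `.loc`, where a day-level string covers the whole day. The sketch below is a different technique from the `isin` filter above, shown under the assumption that the index is sorted; the names `days` and `selected` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [21.787958, 20.983078, 20.122042],
     "B": [16.728439, 15.865983, 15.073062]},
    index=pd.to_datetime(["2019-03-21 19:15:00",
                          "2019-03-25 19:16:00",
                          "2019-03-29 19:17:00"]),
)

# Hypothetical name: the days to keep, as plain date strings.
days = ["2019-03-21", "2019-03-29"]

# df.loc[day:day] selects every row whose timestamp falls on that day;
# concatenating the per-day pieces gives the combined result.
selected = pd.concat([df.loc[day:day] for day in days])

print(selected)
```

This reads naturally for a handful of days; for a long list of dates the single `isin` mask is likely the simpler choice.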

You can do it using

df.loc[df['date_column'].isin(['2019-03-21','2019-03-29'])]  

or

df[(df['date_column'] == '2019-03-21') | (df['date_column'] == '2019-03-29')]

The df.index that you wrote just gives the row labels. For instance, if you run:

for i in df.index: 
    print(i)

With a default index it prints 0, 1, 2, which are the row labels. What you are looking for is the content of a specific column, df['date_column'], so you should compare the values in that column with the dates you are looking for.
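A minimal sketch of the column-based approach, assuming the timestamps live in a regular column named `date_column` (the name used in this answer, not something present in the question's dataframe) and that the index is the default 0, 1, 2. Because the values carry a time of day, the sketch strips it with `.dt.normalize()` before calling `isin`; without that step a datetime column with times would not match plain dates.

```python
import pandas as pd

# Assumed layout: dates in a column, default integer index.
df = pd.DataFrame({
    "date_column": pd.to_datetime(["2019-03-21 19:15:00",
                                   "2019-03-25 19:16:00",
                                   "2019-03-29 19:17:00"]),
    "A": [21.787958, 20.983078, 20.122042],
    "B": [16.728439, 15.865983, 15.073062],
})

# Hypothetical name: the two days we want to keep.
wanted = pd.to_datetime(["2019-03-21", "2019-03-29"])

# Compare the date part of each value in the column with the wanted days.
selected = df.loc[df["date_column"].dt.normalize().isin(wanted)]

print(selected)
```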
