pandas set_index and reset_index change the type of a variable

Here is my pandas DataFrame:

import datetime as dt
import pandas as pd
d1=dt.date(2021,1,1)
d2=dt.date(2021,1,13)
df = pd.DataFrame({'date':[d1,d2],'location':['Paris','New York'],'Number':[2,3]})

Here is my problem:

df = df.set_index(['date'])
df = df.reset_index()
print(df.loc[df.date == d1])  # OK: returns the matching row
df = df.set_index(['location','date'])  # the order of the columns does not matter
df = df.reset_index()
print(df.loc[df.date == d1])  # Not OK: returns an empty DataFrame

It seems that when I call set_index with two columns and then reset_index, the type of the date column is lost.

I don't understand why the comparison works in the first case but not in the second, where I need pd.to_datetime(df['date']).dt.date to make it work again.

The round trip through the two-column index changes the column's dtype to datetime64[ns], so a single value is a Timestamp rather than a datetime.date. Convert the column back to dates before comparing:

df.date[0]
Out[175]: Timestamp('2021-01-01 00:00:00')

df.loc[df.date.dt.date == d1]
Out[177]: 
  location       date  Number
0    Paris 2021-01-01       2
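
As a minimal sketch of the conversion mentioned in the question, the column can be normalised back to datetime.date objects right after the round trip, so later comparisons with d1 behave the same whether or not the column was converted. Whether the conversion happens at all may depend on the pandas version, so the sketch does not rely on it. The variable name match is only illustrative.

```python
import datetime as dt
import pandas as pd

d1 = dt.date(2021, 1, 1)
d2 = dt.date(2021, 1, 13)
df = pd.DataFrame({'date': [d1, d2],
                   'location': ['Paris', 'New York'],
                   'Number': [2, 3]})

# Round trip through a two-column index, as in the question.
df = df.set_index(['location', 'date']).reset_index()

# Normalise the column back to plain datetime.date objects.
# pd.to_datetime accepts both date objects and Timestamps,
# and .dt.date returns an object column of datetime.date values.
df['date'] = pd.to_datetime(df['date']).dt.date

# The comparison with a datetime.date scalar now finds the row again.
match = df.loc[df['date'] == d1]
print(match)
```

An alternative under the same assumptions is to leave the column alone and compare on the Timestamp side instead, for example pd.to_datetime(df['date']) == pd.Timestamp(d1).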

To explain the difference, we can inspect the type of each element.

Apply the type function to the DataFrame at each stage:

df.applymap(type)

As the output below shows, the element type of the date column is different at the last stage, so the comparison with d1 finds no match there.

Output:

import datetime as dt
import pandas as pd
d1=dt.date(2021,1,1)
d2=dt.date(2021,1,13)
df = pd.DataFrame({'date':[d1,d2],'location':['Paris','New York'],'Number':[2,3]})

df.applymap(type)

                      date       location         Number
0  <class 'datetime.date'>  <class 'str'>  <class 'int'>      # <=== datetime.date
1  <class 'datetime.date'>  <class 'str'>  <class 'int'>      # <=== datetime.date

df = df.set_index(['date'])
df = df.reset_index()
print(df.loc[df.date == d1])  # OK: returns the matching row

df.applymap(type)

                      date       location         Number
0  <class 'datetime.date'>  <class 'str'>  <class 'int'>      # <=== datetime.date
1  <class 'datetime.date'>  <class 'str'>  <class 'int'>      # <=== datetime.date


df = df.set_index(['location','date'])  # the order of the columns does not matter
df = df.reset_index()
print(df.loc[df.date == d1])  # Not OK: returns an empty DataFrame

df.applymap(type)

        location                                                date         Number
0  <class 'str'>  <class 'pandas._libs.tslibs.timestamps.Timestamp'>  <class 'int'>         # <=== data type changed to pandas._libs.tslibs.timestamps.Timestamp 
1  <class 'str'>  <class 'pandas._libs.tslibs.timestamps.Timestamp'>  <class 'int'>         # <=== data type changed to pandas._libs.tslibs.timestamps.Timestamp 
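
The inspection above can be wrapped in a small helper, sketched here under a few assumptions. The helper name element_types is hypothetical, and it uses Series.map column by column instead of DataFrame.applymap, since recent pandas releases deprecate applymap in favour of DataFrame.map. The result is printed rather than assumed, because the conversion to Timestamp may depend on the pandas version.

```python
import datetime as dt
import pandas as pd


def element_types(frame):
    """Return a DataFrame holding the Python type of every element."""
    # Apply type() to each column separately with Series.map.
    return frame.apply(lambda col: col.map(type))


d1 = dt.date(2021, 1, 1)
d2 = dt.date(2021, 1, 13)
df = pd.DataFrame({'date': [d1, d2],
                   'location': ['Paris', 'New York'],
                   'Number': [2, 3]})

# Element types before any index manipulation.
before = element_types(df)

# Element types after the two-column set_index / reset_index round trip.
round_trip = df.set_index(['location', 'date']).reset_index()
after = element_types(round_trip)

print(before['date'].tolist())
print(after['date'].tolist())
print(round_trip['date'].dtype)  # the column dtype tells the same story
```

Checking the column dtype is usually the quicker test: an object dtype means the column still holds the original Python objects, while datetime64[ns] means the values come back as Timestamp.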
