
Rename index with duplicates

import datetime
import numpy as np
import pandas as pd
dates_list = ['2015-03-28 10:15:36.560000', '2015-03-28 11:35:17.820000',
           '2015-03-29 13:34:54.380000', '2015-03-29 14:10:41.900000',
           '2015-03-31 16:55:43.680000', '2015-03-31 16:57:58.320000',
           '2015-04-02 18:54:31.480000', '2015-04-02 19:46:46.580000',
           '2015-04-03 20:58:27.940000', '2015-04-03 21:30:05.600000']

df = pd.DataFrame(data=[1,2,3,np.nan,5,6,np.nan,np.nan,8,9],columns=['value'],index=[datetime.datetime.strptime(date, '%Y-%m-%d %H:%M:%S.%f') for date in dates_list])

df.index = df.index.date

df
Out[36]: 
            value
2015-03-28    1.0
2015-03-28    2.0
2015-03-29    3.0
2015-03-29    NaN
2015-03-31    5.0
2015-03-31    6.0
2015-04-02    NaN
2015-04-02    NaN
2015-04-03    8.0
2015-04-03    9.0

How can I rename the index so that I get the following?

df
Out[36]: 
            value
0    1.0
0    2.0
1    3.0
1    NaN
2    5.0
2    6.0
3    NaN
3    NaN
4    8.0
4    9.0

Use factorize and select the first array with [0]:

df.index = df.index.factorize()[0]
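
To make the [0] above less mysterious, here is a minimal sketch of what factorize returns. The names idx, codes, uniques and codes_list are illustrative and not part of the original answer; the index is a shortened stand-in for the question's dates.

```python
import datetime
import pandas as pd

# A small index with duplicate dates, similar to the one in the question.
idx = pd.Index([
    datetime.date(2015, 3, 28), datetime.date(2015, 3, 28),
    datetime.date(2015, 3, 29),
    datetime.date(2015, 3, 31), datetime.date(2015, 3, 31),
])

# factorize returns a tuple: integer codes and the unique labels they refer to.
codes, uniques = idx.factorize()

# Codes are assigned in order of first appearance, so duplicates share a number.
codes_list = codes.tolist()
print(codes_list)
print(list(uniques))
```

The first element of the tuple is the array of integer codes, which is why the answer indexes the result with [0].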

Or use GroupBy.ngroup:

df.index = df.groupby(level=0).ngroup()

print (df)
   value
0    1.0
0    2.0
1    3.0
1    NaN
2    5.0
2    6.0
3    NaN
3    NaN
4    8.0
4    9.0
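
The two approaches agree here because the dates are already in ascending order. As a hedged illustration with a made-up, unsorted index (df_demo is hypothetical, not from the question), factorize numbers labels by first appearance, while groupby with its default sort=True numbers groups by sorted key:

```python
import pandas as pd

# Hypothetical frame whose index labels are NOT in sorted order.
df_demo = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]},
                       index=["b", "b", "a", "c"])

# factorize: codes follow the order in which labels first appear.
by_appearance = df_demo.index.factorize()[0].tolist()

# groupby(...).ngroup() with the default sort=True: codes follow sorted keys.
by_sorted_key = df_demo.groupby(level=0).ngroup().tolist()

# Passing sort=False makes ngroup follow first appearance as well.
by_appearance_ngroup = df_demo.groupby(level=0, sort=False).ngroup().tolist()

print(by_appearance, by_sorted_key, by_appearance_ngroup)
```

If the index may be unsorted and numbering by first appearance is wanted, factorize or groupby(level=0, sort=False) is probably the safer choice.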

Although it is not as neat, we can also use unique and build a mapping from each label to its position.

df.index = pd.Series(df.index).map({k:v for v,k in enumerate(df.index.unique())})
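
As a quick sanity check, the sketch below runs all three approaches on a shortened version of the question's data and compares the results. The names stamps, df_check, via_factorize, via_ngroup and via_map are illustrative only.

```python
import pandas as pd

# Shortened version of the question's data: timestamps reduced to dates.
stamps = pd.to_datetime([
    "2015-03-28 10:15:36", "2015-03-28 11:35:17",
    "2015-03-29 13:34:54",
    "2015-03-31 16:55:43", "2015-03-31 16:57:58",
])
df_check = pd.DataFrame({"value": [1.0, 2.0, 3.0, 5.0, 6.0]},
                        index=stamps.date)

# 1) factorize: integer codes in order of first appearance.
via_factorize = df_check.index.factorize()[0].tolist()

# 2) groupby + ngroup: one group number per row.
via_ngroup = df_check.groupby(level=0).ngroup().tolist()

# 3) unique + dict: map each distinct label to its position.
mapping = {k: v for v, k in enumerate(df_check.index.unique())}
via_map = pd.Series(df_check.index).map(mapping).tolist()

print(via_factorize, via_ngroup, via_map)
```

On sorted input like this, all three should produce the same codes, so the choice is mostly a matter of readability.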
