How to find the index of the first occurrence of each unique element in a Pandas DataFrame?
Consider
df1 = pd.DataFrame({"Name":["Adam","Joseph","James","James","Kevin","Kevin","Kevin","Peter","Peter"]})
I want to get the index of the unique values in the dataframe.
When I run df1["Name"].unique()
I get the output
['Adam','Joseph','James','Kevin','Peter']
But I want the position of the first occurrence of each value:
[0,1,2,4,7]
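I suggest using numpy.unique with return_index set to True. Note that the unique values come back sorted, so the returned indices follow that sorted order rather than the order of appearance.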
np.unique(df1, return_index=True)
Out[13]:
(array(['Adam', 'James', 'Joseph', 'Kevin', 'Peter'], dtype=object),
array([0, 2, 1, 4, 7], dtype=int64))
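If the positions are wanted in order of appearance, as in the question, one option is to sort the index array that numpy.unique returns. This is a minimal sketch and not part of the original answer. It passes the column as a plain array via to_numpy() and assumes the default RangeIndex, so positions and index labels coincide.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# np.unique returns the unique values in sorted order, together with the
# position of each value's first occurrence in the input array.
values, first_idx = np.unique(df1["Name"].to_numpy(), return_index=True)

# Sorting the positions restores the order in which the values first appear.
first_positions = np.sort(first_idx).tolist()
print(first_positions)  # [0, 1, 2, 4, 7]
```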
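The numpy answer is good, but here is an alternative (sort=False keeps the groups in order of first appearance; with the default sort=True the result would follow the sorted names instead):
out = df1.reset_index().groupby(['Name'], sort=False)['index'].min().to_list()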
output:
[0, 1, 2, 4, 7]
Check the code below, which uses rank:
df1['rank'] = df1.groupby(['Name'])['Name'].rank(method='first')
df1[df1['rank'] == 1].index
Int64Index([0, 1, 2, 4, 7], dtype='int64')
First match = first position
In[49]: import pandas as pd
...: df1 = pd.DataFrame({"Name":["Adam","Joseph","James","James","Kevin","Kevin","Kevin","Peter","Peter"]})
...: print ([df1.loc[df1['Name']==i].index[0] for i in df1['Name'].unique()])
...:
[0, 1, 2, 4, 7]
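Another possibility, not taken from any of the answers above, is Series.duplicated(), which avoids rescanning the column once per unique value. It marks every occurrence after the first as True, so negating the mask selects the first occurrences. A minimal sketch on the same example data:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# duplicated() flags repeats of an earlier value (keep="first" is the default),
# so the negated mask is True only at each value's first occurrence.
first_mask = ~df1["Name"].duplicated()

# Index labels of the first occurrences, in order of appearance.
first_index = df1.index[first_mask].tolist()
print(first_index)  # [0, 1, 2, 4, 7]
```

This returns index labels. With the default RangeIndex used here they are the same as the positions.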