How to remove duplicate values in pandas

Question

name    date   
a       [01-01,01-01,01-03]
b       [02-01,03-03,03-03,03-05]
..       ..
..       ..

This is my dataframe.
The data had duplicated IDs and dates, so I grouped by ID:

import pandas as pd

df = pd.DataFrame(data)
# Keep only the id and date columns
df = df[['uid', 'dt']]
# Collect all dates for each uid into a list (duplicates are still included)
df = df.groupby('uid', as_index=False).agg(lambda x: x.tolist())

My desired output looks like this:

name    date   
a       [01-01,01-03]
b       [02-01,03-03,03-05]
..       ..
..       ..

Answer 1

Try:

df.date = df.date.apply(lambda x: list(set(x)))
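Note that `set` does not guarantee any particular element order, so the lists may come back reordered. If the original order matters, one possible variation is `dict.fromkeys`, which keeps the first occurrence of each item in insertion order on Python 3.7+. The sample frame below is a hypothetical stand-in for the question's data, not the asker's actual dataset:

```python
import pandas as pd

# Hypothetical sample frame mirroring the layout shown in the question
df = pd.DataFrame({
    'name': ['a', 'b'],
    'date': [['01-01', '01-01', '01-03'],
             ['02-01', '03-03', '03-03', '03-05']],
})

# dict.fromkeys keeps only the first occurrence of each value,
# preserving the order in which values first appear
df['date'] = df['date'].apply(lambda x: list(dict.fromkeys(x)))
print(df['date'].tolist())
```

This makes a single pass over each list, so it should also scale better on long lists than sorting by `x.index`.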

Answer 2

If you want to remove duplicates while keeping the remaining items in their original order, see the example below:

import pandas as pd

df = pd.DataFrame.from_dict({'name': ['a', 'b'], 'date': [['01-01', '01-01', '01-03'], ['02-01', '03-03', '03-03', '03-05']]})
print('before removing duplicates')
print(df)

print('after removing duplicates and sorting based on initial order')
# set() drops duplicates; sorting by x.index restores first-occurrence order
df['date'] = df['date'].apply(lambda x: sorted(set(x), key=x.index))
print(df)

This results in (the column order may differ depending on the pandas version):

before removing duplicates
                           date name
0         [01-01, 01-01, 01-03]    a
1  [02-01, 03-03, 03-03, 03-05]    b

after removing duplicates and sorting based on initial order
                    date name
0         [01-01, 01-03]    a
1  [02-01, 03-03, 03-05]    b
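Since the lists in the question come from a `groupby`, another option is to drop the duplicate rows before grouping, so that the lists never contain repeats in the first place. This is only a sketch: the `raw` frame below is hypothetical sample data, and it assumes one row per record with the `uid` and `dt` columns used in the question's code.

```python
import pandas as pd

# Hypothetical raw records: one row per (uid, dt) pair, with repeats
raw = pd.DataFrame({
    'uid': ['a', 'a', 'a', 'b', 'b', 'b', 'b'],
    'dt':  ['01-01', '01-01', '01-03', '02-01', '03-03', '03-03', '03-05'],
})

# Drop repeated (uid, dt) pairs first, keeping the first occurrence of each
deduped = raw.drop_duplicates(subset=['uid', 'dt'])

# Then collect the remaining dates for each uid into a list
result = deduped.groupby('uid')['dt'].agg(list).reset_index()
print(result)
```

`drop_duplicates` keeps the first occurrence by default, so the dates stay in their original row order within each group.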

Answer 1
2 2017-04-28 19:32:25

Answer 2
0 2017-04-28 19:40:24