How to modify duplicated rows in Python pandas
Let's say I have a DataFrame (sorted by some priority criterion) with a "name" column. A few names are duplicated, and I want to append a simple indicator to the duplicates.
E.g.,
'jones a'
...
'jones a' # this should become 'jones a2'
To get the subset of duplicates, I could do
# keep='first' flags every repeat after the first occurrence (the old take_last argument has been removed from pandas)
df.loc[df.duplicated(subset=['name'], keep='first'), 'name']
However, I think the apply function does not allow for inplace modification, right? So what I basically ended up doing is:
df.loc[df.duplicated(subset=['name'], keep='first'), 'name'] = \
df.loc[df.duplicated(subset=['name'], keep='first'), 'name'].apply(lambda x: x+'2')
But my feeling is that there might be a better way. Any ideas or tips? I would really appreciate your feedback!
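For reference, `apply` returns a new Series rather than modifying anything in place, so the result does have to be assigned back. A minimal sketch of the same masked update, computing the mask once and using plain string concatenation instead of `apply`; the sample frame and the name `is_repeat` are illustrative assumptions, not part of the question:

```python
import pandas as pd

# Hypothetical sample frame; only the "name" column comes from the question.
df = pd.DataFrame({"name": ["jones a", "smith b", "jones a"]})

# True for every repeat after the first occurrence of a name.
is_repeat = df.duplicated(subset=["name"], keep="first")

# Vectorized string concatenation on the masked rows; no apply needed.
df.loc[is_repeat, "name"] = df.loc[is_repeat, "name"] + "2"

print(df["name"].tolist())
```

Note that this tags every repeat with the same "2", so a name occurring three times would produce two identical "…2" entries.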
Here is one way:
# sample data
import numpy as np
import pandas
d = pandas.DataFrame(
{'Name': ['bob', 'bob', 'bob', 'bill', 'fred', 'fred', 'joe', 'larry'],
'ShoeSize': [8, 9, 10, 12, 14, 11, 10, 12]
}
)
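>>> # group_keys=False keeps the original row index in the result on recent pandas versions
>>> d.groupby('Name', group_keys=False).Name.apply(lambda n: n + (np.arange(len(n))+1).astype(str))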
0 bob1
1 bob2
2 bob3
3 bill1
4 fred1
5 fred2
6 joe1
7 larry1
This appends an indicator to every name. If you want to append the indicator only to those after the first, you can do it with a little special-casing:
>>> d.groupby('Name', group_keys=False).Name.apply(lambda n: n + np.concatenate(([''], (np.arange(len(n))+1).astype(str)[1:])))
0 bob
1 bob2
2 bob3
3 bill
4 fred
5 fred2
6 joe
7 larry
dtype: object
If you want to use this to replace the original names, just do d.Name = ..., where ... is the expression shown above.
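As a variation on the expression above, the same numbering can be sketched without a lambda: `groupby(...).cumcount()` gives each row's 0-based position within its group. This is an alternative technique, not the approach shown in the answer, and the names `position`, `suffix`, and `renamed` are illustrative:

```python
import pandas as pd

# Same names as the sample data above.
d = pd.DataFrame(
    {"Name": ["bob", "bob", "bob", "bill", "fred", "fred", "joe", "larry"]}
)

# 1-based position of each row within its Name group, in original row order.
position = d.groupby("Name").cumcount() + 1

# Empty suffix for the first occurrence, the position as text for later ones.
suffix = position.astype(str).where(position > 1, "")

renamed = d["Name"] + suffix
print(renamed.tolist())
```

Because `cumcount` returns a Series aligned with the original index, `d["Name"] = renamed` would replace the names directly.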
You should think about why you're doing this. It is usually better to keep this sort of information in a separate column than to smash it into a string.
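A minimal sketch of the separate-column idea, assuming a per-name counter is what you need; the column name "Occurrence" and the variable `repeats` are hypothetical:

```python
import pandas as pd

# Hypothetical sample data mirroring the names used above.
d = pd.DataFrame(
    {"Name": ["bob", "bob", "bob", "bill", "fred", "fred", "joe", "larry"]}
)

# 1 for the first row of each name, 2 for the next, and so on.
d["Occurrence"] = d.groupby("Name").cumcount() + 1

# The names stay untouched, so repeats can still be selected or grouped directly.
repeats = d[d["Occurrence"] > 1]
print(repeats)
```

With the counter in its own column, the original names remain usable as join or grouping keys, and a combined label can still be built later if one is needed for display.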