Pandas: how to keep the first or last value per column, as with drop_duplicates

Question

As shown below, duplicate rows should be collapsed per name, keeping the first occurrence for name and the last value for team.

How can I accomplish this with .drop_duplicates() or otherwise?

   name  team ...
0  john  a    ...
1  mike  b    ...
2  john  c

↓

   name  team ...
0  john  c    ...
1  mike  b    ...

-- Additional note in response to comments --

.groupby('name').agg({'team': 'last', 'country': 'first'})

With this approach, if the first row of country for a name is NaN, a value other than the first one is returned, as shown below.

Is this because NaN is ignored? Even if first is specified and the first value is NaN, NaN must still be retained.

   name  team  country ...
0  john  a     NaN     ...
1  mike  b     Brazil  ...
2  john  c     Canada  ...

↓

   name  team  country ...
0  john  c     Canada  ...
1  mike  b     Brazil  ...

Answer 1

You can use the .groupby() function:

df.groupby('name').agg({'team': 'last'})
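
For reference, here is a minimal runnable sketch of that call on the sample data from the question. The frame is rebuilt by hand from the question's table, and as_index=False is my own addition so that name stays a regular column; adjust as needed.

```python
import pandas as pd

# Sample data rebuilt from the question's first table (illustrative only).
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
})

# One row per name, taking the last team seen for that name.
# as_index=False keeps "name" as a column instead of the index.
result = df.groupby("name", as_index=False).agg({"team": "last"})
print(result)
```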

Be aware that the value returned per name depends on the sorting of your dataframe.
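
On the follow-up about NaN: as far as I know, the 'first' and 'last' aggregations skip missing values by default, which would explain why Canada comes back instead of NaN. One possible workaround, sketched here on the assumption that a purely positional pick is what you want, is to take the first row of each group with iloc. The sample frame and the names skipping and positional are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's second table (illustrative only).
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

# 'first' skips missing values, so john gets "Canada" here.
skipping = df.groupby("name").agg({"team": "last", "country": "first"})

# Picking by position keeps whatever is in the first row, NaN included.
positional = df.groupby("name").agg(
    {"team": "last", "country": lambda s: s.iloc[0]}
)

print(skipping)
print(positional)
```

If row order matters, sorting the frame with sort_values before grouping makes it explicit which row counts as first and which as last.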
