How to conditionally select a column value per key from a long-format pandas DataFrame with multiple values per key? Groupby, then if-then?

Question

I have a data frame that contains each person's name and the positions they have played per year. It is in long format, with multiple entries per person. I would like to build a single data frame covering all years, with just one entry per person.

I am thinking about using groupby for this. However, I don't know how to handle the position titles. A person can have forward, offence, or both. What I would like to do is:

- if a person has entries for forward AND offence, set their position to "Both Forward and Offence";
- if a person has forward, offence and both, pick "Both Forward and Offence";
- if a person has just forward, or just offence, keep what they have.

I have NO idea where to start though. I have tried googling this, but I think I don't know the right terms, because nothing useful comes up. I am thinking of using groupby followed by an if-then statement, but I am not sure. Any advice, or even a suggestion of what terms to google for this, would be much appreciated!

Input dataset:

Name	Position
Tom	Forward
Tom	Offence
Aiden	Forward
Aiden	Offence
Aiden	Both Forward and Offence
Kristy	Forward
Kristy	Forward

import pandas as pd

data = {'Name': ['Tom', 'Tom', 'Aiden', 'Aiden', 'Aiden', 'Kristy', 'Kristy'],
        'Position': ['Forward', 'Offence', 'Forward', 'Offence',
                     'Both Forward and Offence', 'Forward', 'Forward']}
df = pd.DataFrame(data)

Ideal output dataset:

Name	Position
Tom	Both Forward and Offence
Aiden	Both Forward and Offence
Kristy	Forward

Answer 1

You had the right idea with groupby and if-else. You can restate your problem a bit more simply: if the number of unique positions (nunique) per name is 1, you want that position; otherwise you want 'Both Forward and Offence'. So a simple way is:

res = (
    df.groupby('Name', sort=False)
      ['Position'].apply(lambda x: x.min() if x.nunique()==1 
                                   else 'Both Forward and Offence')
      .reset_index()
)
print(res)
#      Name                  Position
# 0     Tom  Both Forward and Offence
# 1   Aiden  Both Forward and Offence
# 2  Kristy                   Forward

The x.min() is there to select a single value when, as for Kristy, several rows share the same position; it could just as well be x.max(), x.iloc[0], ...
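
If the lambda feels too compact, the same if-then logic can be spelled out in a named function and passed to agg. This is only a sketch of the idea above, not a different method: combine_positions and BOTH are hypothetical names introduced here for illustration, and the code assumes the Position column has no missing values.

```python
import pandas as pd

# Label used whenever a person has more than one distinct position
# (hypothetical constant name, taken from the question's wording).
BOTH = 'Both Forward and Offence'


def combine_positions(positions: pd.Series) -> str:
    """Hypothetical helper: collapse one person's positions into one label."""
    unique = set(positions)
    if len(unique) == 1:
        # Only one distinct position (e.g. Kristy): keep it as-is.
        return next(iter(unique))
    # Several distinct positions (e.g. Tom, Aiden): report both.
    return BOTH


data = {'Name': ['Tom', 'Tom', 'Aiden', 'Aiden', 'Aiden', 'Kristy', 'Kristy'],
        'Position': ['Forward', 'Offence', 'Forward', 'Offence',
                     'Both Forward and Offence', 'Forward', 'Forward']}
df = pd.DataFrame(data)

res = (
    df.groupby('Name', sort=False)['Position']
      .agg(combine_positions)   # one call per name, one label back
      .reset_index()
)
print(res)
```

Writing the rule as a function makes it easier to extend later, for example if a third position is added and "more than one distinct value" no longer automatically means "both".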
