Change the value of a pandas dataframe column based on a condition, also depending on other columns of the dataframe

    Category              DishName   Id 
0   a                     Pistachio  621f4884e48bc60012364b13   
1   a                     Pistachio  621f4884e48bc60012364b13   
2   a                     Pistachio  621f4884e48bc60012364b13   
3   a                     achar      621f4884e48bc60012364b13   
4   b                     achar      621f4884e48bc60012364b13   
5   b                     achar      621f4884e48bc60012364b13   
6   a                     chicken    621f4884e48bc60012364b13   
7   b                     chicken    621f4884e48bc60012364b13   
8   c                     chicken    621f4884e48bc60012364b13 

My dataframe has 3 columns: Category, DishName and Id. Considering the Id and the DishName, I have to assign the Category.

Assign "a" if all the category values are "a"

Assign "b" if the category values are "a", "b"

Assign "c" if the category values are "a", "b", "c"

Expected output is:

    Category              DishName   Id 
0   a                     Pistachio  621f4884e48bc60012364b13   
1   a                     Pistachio  621f4884e48bc60012364b13   
2   a                     Pistachio  621f4884e48bc60012364b13   
3   b                     achar      621f4884e48bc60012364b13   
4   b                     achar      621f4884e48bc60012364b13   
5   b                     achar      621f4884e48bc60012364b13   
6   c                     chicken    621f4884e48bc60012364b13   
7   c                     chicken    621f4884e48bc60012364b13   
8   c                     chicken    621f4884e48bc60012364b13 
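
To make the example reproducible, the input frame shown at the top of the question could be built roughly like this. This is only a sketch that retypes the values from the table above; nothing here comes from the original poster's code.

```python
import pandas as pd

# Input frame retyped from the table in the question.
df = pd.DataFrame({
    "Category": ["a", "a", "a", "a", "b", "b", "a", "b", "c"],
    "DishName": ["Pistachio"] * 3 + ["achar"] * 3 + ["chicken"] * 3,
    "Id": ["621f4884e48bc60012364b13"] * 9,
})

print(df)
```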

You can convert the column to an ordered Categorical and take the max per group:

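import pandas as pd

# Make Category an ordered Categorical (a < b < c), then broadcast
# the per-DishName maximum back to every row of its group.
df['Category'] = (pd
                  .Series(pd.Categorical(df['Category'],
                                         categories=['a', 'b', 'c'], ordered=True),
                          index=df.index)
                  .groupby(df['DishName'])
                  .transform('max')
                  )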

NB. You wouldn't need the Categorical for simply a, b, c, as those three are already lexicographically sorted, but I imagine a real-life case wouldn't necessarily be. For example, low < medium < high is sorted logically but not lexicographically.
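
The difference can be seen with a small sketch. The column names `Group` and `Level` and the data are hypothetical, chosen only to illustrate the low < medium < high case; they are not from the question.

```python
import pandas as pd

# Hypothetical data: 'Group' and 'Level' are illustrative names only.
df = pd.DataFrame({
    "Group": ["x", "x", "y", "y", "y"],
    "Level": ["low", "medium", "low", "high", "medium"],
})

# Plain string max compares lexicographically, so 'medium' beats 'high'.
lexicographic = df.groupby("Group")["Level"].transform("max")

# An ordered Categorical compares by the declared order: low < medium < high.
ordered = pd.Series(
    pd.Categorical(df["Level"], categories=["low", "medium", "high"], ordered=True),
    index=df.index,
)
logical = ordered.groupby(df["Group"]).transform("max")

print(lexicographic.tolist())        # ['medium', 'medium', 'medium', 'medium', 'medium']
print(logical.astype(str).tolist())  # ['medium', 'medium', 'high', 'high', 'high']
```

With plain strings, group `y` ends up as `medium` even though it contains `high`; the ordered Categorical gives `high` as intended.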

Output:

  Category   DishName                        Id
0        a  Pistachio  621f4884e48bc60012364b13
1        a  Pistachio  621f4884e48bc60012364b13
2        a  Pistachio  621f4884e48bc60012364b13
3        b      achar  621f4884e48bc60012364b13
4        b      achar  621f4884e48bc60012364b13
5        b      achar  621f4884e48bc60012364b13
6        c    chicken  621f4884e48bc60012364b13
7        c    chicken  621f4884e48bc60012364b13
8        c    chicken  621f4884e48bc60012364b13

Alternatively, since a, b, c already sort lexicographically, a plain max per group is enough:

df['Category'] = df.groupby('DishName')['Category'].transform('max')

Output:

  Category   DishName                        Id
0        a  Pistachio  621f4884e48bc60012364b13
1        a  Pistachio  621f4884e48bc60012364b13
2        a  Pistachio  621f4884e48bc60012364b13
3        b      achar  621f4884e48bc60012364b13
4        b      achar  621f4884e48bc60012364b13
5        b      achar  621f4884e48bc60012364b13
6        c    chicken  621f4884e48bc60012364b13
7        c    chicken  621f4884e48bc60012364b13
8        c    chicken  621f4884e48bc60012364b13
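
Both answers group by DishName only, which is enough for the sample because every row shares one Id. The question mentions considering the Id as well; if the same dish can appear under several Ids, a possible variation is to group on both columns. In the sketch below the second Id is a made-up placeholder, added only to show the effect.

```python
import pandas as pd

# 'hypothetical-id' is a made-up value, not from the question.
df = pd.DataFrame({
    "Category": ["a", "a", "b", "a", "c"],
    "DishName": ["achar"] * 5,
    "Id": ["621f4884e48bc60012364b13"] * 3 + ["hypothetical-id"] * 2,
})

# Take the max Category within each (Id, DishName) pair.
df["Category"] = df.groupby(["Id", "DishName"])["Category"].transform("max")

print(df["Category"].tolist())  # ['b', 'b', 'b', 'c', 'c']
```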
