Python Pandas: Update a row based on the value of another column

Question

I have a pandas DataFrame, df, like this:

name   | grade | grade_type
---------------------------
sarah  | B     | letter  
alice  | A     | letter
eliza  | C     | letter
beth   | 76    | numeral
jones  | 90    | numeral

All values in df are strings, including the numbers. I want to convert the numeric grade values into letters, based on the grade_type column, to get:

name   | grade | grade_type
---------------------------
sarah  | B     | letter  
alice  | A     | letter
eliza  | C     | letter
beth   | B     | numeral
jones  | A     | numeral

For completeness, the numeral-to-letter grade conversions are:

A: grade > 80
B: 70 < grade <= 80
C: 60 < grade <= 70

Why doesn't this work?

for index, row in df.iterrows():
  if row.grade_type == "numeral":
    grade_val = int(row.grade.values[0])
    if grade_val > 80:
      row.grade = "A" # This assignment doesn't update row.grade!
    elif...

The alternative is to use df.apply(...lambda:...), but I'm not sure how to pull that off, since we have to check the grade_type column before deciding whether to update the grade value.

Answer 1

Your DataFrame doesn't update because each row yielded by iterrows() is a new Series, generally a copy rather than a view of the DataFrame. Assigning to row.grade therefore changes that copy, not df.

You can use the index returned by iterrows() and modify the DataFrame directly:

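for index, row in df.iterrows():
    # Only numeric grades need converting; letter grades are left alone.
    if row['grade_type'] == 'numeral':
        grade_val = int(row['grade'])  # grades are stored as strings
        if grade_val > 80:
            # Write to df itself, not to the row copy.
            df.loc[index, 'grade'] = 'A'
        ...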
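
Filled out with the thresholds from the question, the loop might look like the sketch below. The sample frame is rebuilt from the question's table, the helper name to_letter is made up for illustration, and leaving grades of 60 or below unchanged is an assumption, since the question does not say what they map to.

```python
import pandas as pd

# Sample data rebuilt from the question's table; every value is a string.
df = pd.DataFrame({
    "name": ["sarah", "alice", "eliza", "beth", "jones"],
    "grade": ["B", "A", "C", "76", "90"],
    "grade_type": ["letter", "letter", "letter", "numeral", "numeral"],
})


def to_letter(grade_val):
    """Hypothetical helper: map a numeric grade to a letter.

    Returns None for grades of 60 or below, which the question leaves undefined.
    """
    if grade_val > 80:
        return "A"
    if grade_val > 70:
        return "B"
    if grade_val > 60:
        return "C"
    return None


for index, row in df.iterrows():
    if row["grade_type"] != "numeral":
        continue  # letter grades are already in the target form
    letter = to_letter(int(row["grade"]))
    if letter is not None:
        # Assign through df.loc so the DataFrame itself is updated.
        df.loc[index, "grade"] = letter

print(df)
```

Only the current row is written on each pass, so updating df inside the loop does not interfere with the rows still to come.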

Or, as you said, you can use df.apply() and pass it a custom function:

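def get_grades(x):
    # x is one row of df; letter grades are returned unchanged.
    if x['grade_type'] == 'letter':
        return x['grade']
    grade_val = int(x['grade'])  # grades are stored as strings
    if grade_val > 80:
        return "A"
    ...


# axis=1 calls get_grades once per row.
df['grade'] = df.apply(get_grades, axis=1)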
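
For reference, here is one way the row-wise function could be completed with the thresholds from the question. This is a sketch: the sample frame is rebuilt from the question's table, and returning grades of 60 or below unchanged is an assumption, because the question defines no letter for them.

```python
import pandas as pd

# Sample data rebuilt from the question's table; every value is a string.
df = pd.DataFrame({
    "name": ["sarah", "alice", "eliza", "beth", "jones"],
    "grade": ["B", "A", "C", "76", "90"],
    "grade_type": ["letter", "letter", "letter", "numeral", "numeral"],
})


def get_grades(x):
    """Return the letter grade for one row (x is a Series)."""
    if x["grade_type"] == "letter":
        return x["grade"]
    grade_val = int(x["grade"])  # numeric grades are stored as strings
    if grade_val > 80:
        return "A"
    if grade_val > 70:
        return "B"
    if grade_val > 60:
        return "C"
    # Assumption: no letter is defined for 60 or below, so keep the value.
    return x["grade"]


# axis=1 passes each row to get_grades; the result replaces the column.
df["grade"] = df.apply(get_grades, axis=1)
print(df)
```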

You can also put an if/else in the lambda to check whether x['grade_type'] is 'numeral', as follows. Use whichever version you find easier to read.

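def get_grades(grade_val):
    grade_val = int(grade_val)  # grades are stored as strings
    if grade_val > 80:
        return "A"
    ...

# Convert only the 'numeral' rows; keep letter grades as they are.
df['grade'] = df.apply(lambda x: get_grades(x['grade'])
                       if x['grade_type'] == 'numeral' else x['grade'], axis=1)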
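
A completed version of this variant, again only as a sketch: the sample frame is rebuilt from the question's table, and passing grades of 60 or below through unchanged is an assumption. Here the helper sees a single grade value, and the grade_type check lives entirely in the lambda.

```python
import pandas as pd

# Sample data rebuilt from the question's table; every value is a string.
df = pd.DataFrame({
    "name": ["sarah", "alice", "eliza", "beth", "jones"],
    "grade": ["B", "A", "C", "76", "90"],
    "grade_type": ["letter", "letter", "letter", "numeral", "numeral"],
})


def get_grades(grade):
    """Map one numeric grade (given as a string) to a letter."""
    grade_val = int(grade)
    if grade_val > 80:
        return "A"
    if grade_val > 70:
        return "B"
    if grade_val > 60:
        return "C"
    # Assumption: no letter is defined for 60 or below, so keep the value.
    return grade


# The conditional expression decides per row whether to convert at all.
df["grade"] = df.apply(
    lambda x: get_grades(x["grade"]) if x["grade_type"] == "numeral" else x["grade"],
    axis=1,
)
print(df)
```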

Answer 1 (3, accepted, 2017-04-20 01:27:54)
