Python pandas: Updating row based on value from another column
I have a pandas DataFrame, df, like:
name | grade | grade_type
---------------------------
sarah | B | letter
alice | A | letter
eliza | C | letter
beth | 76 | numeral
jones | 90 | numeral
All values in df are strings, including the numbers. I want to convert the numeric grade values into letters, based on checking the grade_type column, to get:
name | grade | grade_type
---------------------------
sarah | B | letter
alice | A | letter
eliza | C | letter
beth | B | numeral
jones | A | numeral
For completeness, the numeral-to-letter grade conversions are:
A: grade > 80
B: 70 < grade <= 80
C: 60 < grade <= 70
Why doesn't this work?
for index, row in df.iterrows():
    if row.grade_type == "numeral":
        grade_val = int(row.grade.values[0])
        if grade_val > 80:
            row.grade = "A" # This assignment doesn't update row.grade!
        elif...
The alternative is using df.apply(...lambda:...), but I'm not sure how to pull that off, since we have to check the grade_type column before deciding whether or not to update the grade value.
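Your DataFrame doesn't update because the rows yielded by iterrows() are generally copies rather than views, so assigning to row only changes that temporary copy. The pandas documentation warns that, depending on the data types, the iterator returns a copy and writing to it will have no effect.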
You can use the index returned from iterrows() and manipulate the DataFrame directly:
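for index, row in df.iterrows():
    # Only numeric grades need converting; int() would fail on a letter grade.
    if row.grade_type == 'numeral':
        # row.grade is already a plain string here, so no .values[0] is needed.
        grade_val = int(row.grade)
        if grade_val > 80:
            df.loc[index, 'grade'] = 'A'
        ...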
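For reference, here is a minimal runnable sketch of that loop with the thresholds from the question filled in. The sample frame is rebuilt from the question's table. The question gives no letter for scores of 60 or below, so leaving those rows untouched is an assumption on my part.

```python
import pandas as pd

# Sample data from the question; every value is a string.
df = pd.DataFrame({
    "name": ["sarah", "alice", "eliza", "beth", "jones"],
    "grade": ["B", "A", "C", "76", "90"],
    "grade_type": ["letter", "letter", "letter", "numeral", "numeral"],
})

for index, row in df.iterrows():
    if row["grade_type"] != "numeral":
        continue  # letter grades are already in the target form
    grade_val = int(row["grade"])
    # Write through df.loc so the change lands in the DataFrame itself,
    # not in the temporary row object.
    if grade_val > 80:
        df.loc[index, "grade"] = "A"
    elif grade_val > 70:
        df.loc[index, "grade"] = "B"
    elif grade_val > 60:
        df.loc[index, "grade"] = "C"
    # Assumption: scores of 60 or below are left unchanged, because the
    # question does not define a letter for them.

print(df["grade"].tolist())  # ['B', 'A', 'C', 'B', 'A']
```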
Or, as you said, you can use df.apply() and pass it a custom function:
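def get_grades(x):
    # Letter grades pass through unchanged.
    if x['grade_type'] == 'letter':
        return x['grade']
    # All values are strings, so convert before comparing.
    if int(x['grade']) > 80:
        return "A"
    ...

# axis=1 passes each row to get_grades as a Series.
df['grade'] = df.apply(get_grades, axis=1)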
You can also use if/else in your lambda to check whether x['grade_type'] is 'numeral', as follows. Use whichever version looks easier to read.
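def get_grades(grade_val):
    grade_val = int(grade_val)  # grades are stored as strings
    if grade_val > 80:
        return "A"
    ...

# Convert only the rows whose grade_type is 'numeral'; keep the rest as-is.
df['grade'] = df.apply(lambda x: get_grades(x['grade'])
                       if x['grade_type'] == 'numeral' else x['grade'], axis=1)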
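Putting the apply() version together end to end, a sketch might look like the following. The helper name to_letter is my own and not from the original answer. Returning the input unchanged for scores of 60 or below is an assumption, since the question defines no letter for that range.

```python
import pandas as pd

def to_letter(grade):
    """Map a numeric grade string to a letter using the question's thresholds."""
    grade_val = int(grade)  # grades are stored as strings
    if grade_val > 80:
        return "A"
    if grade_val > 70:
        return "B"
    if grade_val > 60:
        return "C"
    # Assumption: no rule is given for 60 or below, so keep the original value.
    return grade

# Sample data from the question; every value is a string.
df = pd.DataFrame({
    "name": ["sarah", "alice", "eliza", "beth", "jones"],
    "grade": ["B", "A", "C", "76", "90"],
    "grade_type": ["letter", "letter", "letter", "numeral", "numeral"],
})

# Check grade_type first, and only convert the rows marked 'numeral'.
df["grade"] = df.apply(
    lambda x: to_letter(x["grade"]) if x["grade_type"] == "numeral" else x["grade"],
    axis=1,
)

print(df)
```

Because apply() returns a new Series that is assigned back to the column, this avoids the copy problem that affects writing to row inside iterrows().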