
Set value of a column based on another column with a condition

I encountered a problem that can probably be solved by iterating over the rows of this DataFrame, but there might be a more elegant solution.

I am trying to create the column 'desired' as a string when the value of col1 is higher than 20. I tried np.where without success.

DataFrame

Who can help me? Thanks!

This should work:

df['desired'] = ''
df.loc[df['col1'] > 20, 'desired'] = 'col1 is ' + df['col1'].astype(str)

Example

import pandas as pd

df = pd.DataFrame({'col1': [25, 10, 15, 21]})

df['desired'] = ''
df.loc[df['col1'] > 20, 'desired'] = 'col1 is ' + df['col1'].astype(str)

#    col1     desired
# 0    25  col1 is 25
# 1    10            
# 2    15            
# 3    21  col1 is 21

The problem with this

The power of pandas is in holding structured data. As soon as you combine strings with numeric data, you lose that structure. Manipulating strings is tedious; for example, you can't add 1 to the "desired" column.
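To make the point concrete, here is a minimal sketch using the sample data from the example above. It only inspects the column types; the variable names (col1_is_numeric, desired_is_numeric, plus_one) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [25, 10, 15, 21]})
df['desired'] = ''
df.loc[df['col1'] > 20, 'desired'] = 'col1 is ' + df['col1'].astype(str)

# col1 is still numeric, so element-wise arithmetic works on it.
col1_is_numeric = pd.api.types.is_numeric_dtype(df['col1'])
plus_one = (df['col1'] + 1).tolist()

# 'desired' holds text, so pandas no longer treats it as a numeric column.
desired_is_numeric = pd.api.types.is_numeric_dtype(df['desired'])

print(col1_is_numeric, desired_is_numeric, plus_one)
```

Once the number is embedded in a string, getting it back means parsing the text again, which is the tedium referred to above.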

A better idea

It is better to use a Boolean column to signify a desired condition. For example:

df['desired'] = df['col1'] > 20

This gives a Boolean Series (True or False for each row) depending on the condition specified.
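A Boolean column can then be used directly for filtering and counting, and the display string can be built only where it is needed. The following is a sketch on the same sample data; the names matching, n_matching and labels are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': [25, 10, 15, 21]})
df['desired'] = df['col1'] > 20

# Use the Boolean column as a mask to keep only the matching rows.
matching = df[df['desired']]

# True counts as 1, so the sum is the number of matching rows.
n_matching = int(df['desired'].sum())

# Build the text only for the rows that need it, at display time.
labels = ('col1 is ' + matching['col1'].astype(str)).tolist()

print(matching)
print(n_matching, labels)
```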

Use numpy.where to create the new column by condition:

import numpy as np

df['desired'] = np.where(df['col1'] > 20, 'col1 is ' + df['col1'].astype(str), '')
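As a quick sanity check, the sketch below runs the np.where version next to the .loc version on the sample data and compares the two results. The names mask, label, by_loc and by_where are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [25, 10, 15, 21]})
mask = df['col1'] > 20
label = 'col1 is ' + df['col1'].astype(str)

# Approach 1: start with empty strings, then fill the matching rows via .loc.
by_loc = df.copy()
by_loc['desired'] = ''
by_loc.loc[mask, 'desired'] = label

# Approach 2: pick between the label and an empty string in one np.where call.
by_where = df.copy()
by_where['desired'] = np.where(mask, label, '')

# Both approaches should produce the same column on this data.
same = by_loc['desired'].tolist() == by_where['desired'].tolist()
print(same)
```

Note that the label is computed for every row in both cases; the condition only decides which rows keep it.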
