How do I create a new column in a pandas DataFrame based on a condition?

Question

I have a data frame with the following columns:

import pandas as pd

d = {'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]}
df = pd.DataFrame(data=d)

When there are 5 digits in the zip_code column, I want to return True. When there are not 5 digits, I want to return the "find_no". The sample output would put the results in a column added to the dataframe, with each value on the row it refers to.

Answer 1

You could try np.where:

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, True, df['find_no'])
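As a quick check, here is a minimal sketch that runs the same expression on the sample frame from the question and prints what ends up in the new column. The intermediate name is_five is only introduced here for readability and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]})

# Boolean mask: True where the zip code has exactly 5 digits
# (is_five is an illustrative name, not from the original answer)
is_five = df['zip_code'].astype(str).str.len() == 5

# Same call as in the answer: True where the mask holds, find_no otherwise
df['result'] = np.where(is_five, True, df['find_no'])

print(df['result'].tolist())  # [1, 1, 3]
print(df['result'].dtype)     # an integer dtype, so True is stored as 1
```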

The only downside of this approach is that NumPy converts the True values to 1, which could be confusing. One way to keep the values readable is to cast both branches to strings:

import numpy as np

df['result'] = np.where(df['zip_code'].astype(str).str.len() == 5, 'True', df['find_no'].astype(str))

The downside here is that casting to strings loses the original meaning (and type) of those values. I guess it all depends on what you're hoping to accomplish.
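If both the boolean True and the integer find_no need to survive unchanged, one further option (an assumption on my part, not something the original answer proposes) is to cast find_no to object dtype before calling np.where. A sketch on the sample data follows; the column then holds mixed Python objects, so vectorized numeric operations on it will no longer be efficient.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'find_no': [1, 2, 3], 'zip_code': [32351, 19207, 8723]})

# True where the zip code has exactly 5 digits (illustrative variable name)
is_five = df['zip_code'].astype(str).str.len() == 5

# Casting find_no to object stops NumPy from promoting True to an integer,
# so the result keeps real booleans alongside the original integers
df['result'] = np.where(is_five, True, df['find_no'].astype(object))

print(df['result'].dtype)  # object
print(df)
```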

Answer 1: 0, accepted 2022-05-19 15:14:18
