Pandas: create a new column based on the values of existing columns

Question

I have a pandas DataFrame with two columns, as follows:

A      B
Yes    No
Yes    Yes
No     Yes
No     No
NA     Yes
NA     NA

I want to create a new column based on these values such that if either column is Yes, the value in the new column is also Yes. If both columns are No, the new column should be No. Finally, if both columns are NA, the new column should also be NA. The expected output for the data above is:

C
Yes
Yes
Yes
No
Yes
NA

I wrote a loop over the length of the DataFrame that checks each value to build the new column. However, it takes a long time for 10M records. Is there a faster, more Pythonic way to achieve this?

Answer 1

Something like:

df.fillna('').max(axis=1)
Out[106]: 
0    Yes
1    Yes
2    Yes
3     No
4    Yes
5       
dtype: object
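
Note that the last row in the output above is an empty string rather than NA. The following is a minimal sketch of the same idea that puts the missing value back afterwards. It assumes the NA entries in the question are real missing values (NaN) rather than the literal string "NA"; the names row_max and C are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NA is assumed to be a real missing value (NaN).
df = pd.DataFrame(
    {
        "A": ["Yes", "Yes", "No", "No", np.nan, np.nan],
        "B": ["No", "Yes", "Yes", "No", "Yes", np.nan],
    },
    dtype=object,
)

# Strings compare lexicographically, so "Yes" > "No" > "". The row-wise
# maximum therefore picks "Yes" if present, otherwise "No", otherwise "".
row_max = df.fillna("").max(axis=1)

# Turn the empty-string placeholder back into a missing value.
df["C"] = row_max.mask(row_max == "")
print(df)
```

This relies on the fact that "Yes" happens to sort after "No", so it would not carry over to arbitrary labels without adjustment.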

Answer 2

Try:

(df == 'Yes').eval('A | B').astype(str).mask(df['A'].isna() & df['B'].isna())
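
As written, the one-liner above appears to yield the strings 'True' and 'False' (from astype(str)) rather than Yes and No. A hedged variation that maps the boolean result back to the labels in the question could look like this. It assumes NA means a real missing value (NaN); any_yes and both_missing are illustrative names, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NA is assumed to be a real missing value (NaN).
df = pd.DataFrame(
    {
        "A": ["Yes", "Yes", "No", "No", np.nan, np.nan],
        "B": ["No", "Yes", "Yes", "No", "Yes", np.nan],
    },
    dtype=object,
)

# True where at least one of the two columns is "Yes" (NaN compares as False).
any_yes = (df["A"] == "Yes") | (df["B"] == "Yes")

# True only where both columns are missing.
both_missing = df["A"].isna() & df["B"].isna()

# Map booleans to the labels, then blank out rows where both inputs are missing.
df["C"] = any_yes.map({True: "Yes", False: "No"}).mask(both_missing)
print(df)
```

Both steps are vectorized, so this avoids the Python-level loop described in the question.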

Answer 3

Another way of doing it. Hard-coded, though.

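import numpy as np

# Assumes NA is a real missing value (NaN), hence isna() instead of == 'NaN'.
both_na = df['A'].isna() & df['B'].isna()
conditions = [(df['A'] == 'Yes') | (df['B'] == 'Yes'), (df['A'] == 'No') & (df['B'] == 'No'), both_na]
choicelist = ['Yes', 'No', 'NaN']
df['C'] = np.select(conditions, choicelist, default='NaN')  # string default matches the choices' dtype
df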

Answer 1: score 7, accepted, 2020-05-01 22:10:34

Answer 2: score 2, 2020-05-01 22:03:00

Answer 3: score 0, 2020-05-01 22:19:10
