Pandas: Creating new column based on values from existing column
I have a pandas DataFrame with two columns, as follows:
A B
Yes No
Yes Yes
No Yes
No No
NA Yes
NA NA
I want to create a new column based on these values such that if either column is Yes, the new column is also Yes. If both columns are No, the new column is also No. Finally, if both columns are NA, the new column is also NA. Example output for the above data:
C
Yes
Yes
Yes
No
Yes
NA
I wrote a loop over the length of the DataFrame that checks each value to build the new column. However, it takes a long time for 10M records. Is there a faster, more Pythonic way to achieve this?
Something like:
df.fillna('').max(axis=1)
Out[106]:
0 Yes
1 Yes
2 Yes
3 No
4 Yes
5
dtype: object
Try:
(df == 'Yes').eval('A | B').map({True: 'Yes', False: 'No'}).mask(df['A'].isna() & df['B'].isna())
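For anyone who wants to try this on the sample data, here is a minimal self-contained sketch of one vectorized way to do it. It assumes NA in the question means a real missing value (NaN), not the string 'NA'. It also assumes that a row with one No and one NA should give No, which the question does not specify. The names any_yes and all_na are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NA is assumed to be a real missing value.
df = pd.DataFrame({
    "A": ["Yes", "Yes", "No", "No", np.nan, np.nan],
    "B": ["No", "Yes", "Yes", "No", "Yes", np.nan],
})

# True where at least one of the two columns is "Yes" (NaN compares as False).
any_yes = df["A"].eq("Yes") | df["B"].eq("Yes")
# True where both columns are missing.
all_na = df["A"].isna() & df["B"].isna()

# Start from Yes/No, then blank out the rows where both inputs are missing.
df["C"] = np.where(any_yes, "Yes", "No")
df["C"] = df["C"].mask(all_na)

print(df)
```

Everything here operates on whole columns, so there is no Python-level loop over the rows, which is what makes the loop in the question slow.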