Python Pandas: Create new column out of other columns where value is not null

I have a data frame like this:

----------------
RecID| A  |B
----------------
1    |NaN | x 
2    |y   | NaN 
3    |z   | NaN
4    |NaN | a 
5    |NaN | b 
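For anyone who wants to reproduce the problem, the frame above can be rebuilt roughly as follows. This is a sketch: the question only shows the printed form, so the use of `np.nan` for the missing cells and plain strings for the values are assumptions.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question; NaN marks the missing cells.
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})
print(df)
```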

I want to create a new column, C, from A and B such that if A is null, C is filled from B, and if B is null, C is filled from A:

----------------------
RecID|A   |B    |C 
----------------------
1    |NaN | x   |x
2    |y   | NaN |y 
3    |z   | NaN |z
4    |NaN | a   |a
5    |NaN | b   |b

Lastly, is there an efficient way to do this if I have more than two columns? For example, I have columns A-Z and want to create a new column A1 out of columns A-Z, similar to the above.

pandas
lookup
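This is the generalizable solution the OP was looking for; it works across an arbitrary number of columns. Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so this snippet needs an older pandas.

# For each row, find the label of the first non-null column in A..B
lookup = df.loc[:, 'A':'B'].notnull().idxmax(1)
# Pick the value at (row label, column label) for every row
df.assign(A1=df.lookup(lookup.index, lookup.values))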

   RecID    A    B A1
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
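On pandas versions that no longer ship `DataFrame.lookup`, the same row-wise pick can be sketched with NumPy integer indexing. This is not the answerer's code; the names `cols`, `first_label`, `col_pos` and `row_pos` are introduced here for illustration, and the frame is rebuilt under the assumption that missing cells are `np.nan`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": [np.nan, "y", "z", np.nan, np.nan],
    "B": ["x", np.nan, np.nan, "a", "b"],
})

cols = df.loc[:, "A":"B"]
# Label of the first non-null column in each row.
first_label = cols.notnull().idxmax(axis=1)
# Translate the labels to integer positions and pick one cell per row.
col_pos = cols.columns.get_indexer(first_label)
row_pos = np.arange(len(cols))
result = df.assign(A1=cols.to_numpy()[row_pos, col_pos])
print(result)
```

One caveat: for a row in which every candidate column is null, `idxmax` still returns the first column label, so the picked value is simply that column's NaN.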

fillna

df.assign(C=df.A.fillna(df.B))

   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b

mask

df.assign(C=df.A.mask(df.A.isnull(), df.B))

   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b

combine_first

df.assign(C=df.A.combine_first(df.B))

   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b
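The two-column `combine_first` call can be extended to more columns by folding it over a list of column names. The sketch below uses a hypothetical three-column frame (column "Z" stands in for any further columns) and `functools.reduce`; it is one possible generalization, not something the answer itself shows.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical three-column frame; "Z" stands in for further columns.
df = pd.DataFrame({
    "A": [np.nan, "y", np.nan],
    "B": ["x", np.nan, np.nan],
    "Z": [np.nan, np.nan, "q"],
})

# Fold combine_first left to right: earlier columns take priority,
# later ones only fill the cells that are still null.
columns = ["A", "B", "Z"]
df["A1"] = reduce(
    lambda acc, name: acc.combine_first(df[name]),
    columns[1:],
    df[columns[0]],
)
print(df)
```

Because the fold goes left to right, the order of `columns` decides which value wins when a row has more than one non-null entry.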

numpy
np.where

import numpy as np
df.assign(C=np.where(df.A.notnull(), df.A, df.B))

   RecID    A    B  C
0      1  NaN    x  x
1      2    y  NaN  y
2      3    z  NaN  z
3      4  NaN    a  a
4      5  NaN    b  b

In the case of multiple columns, you can use a forward fill along each row, which leaves the last non-null value of the row in the final column. This example assumes that you want to combine all columns 'A' through 'Z':

# fillna(method='ffill') is deprecated since pandas 2.1; ffill(axis=1) is the equivalent
df['AZ'] = df.loc[:, 'A':'Z'].ffill(axis=1)['Z']

This method works for two columns, too:

df['C'] = df.loc[:, 'A':'B'].ffill(axis=1)['B']
#0    x
#1    y
#2    z
#3    a
#4    b
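When a row can hold more than one non-null value, the direction of the fill matters: a forward fill yields the last non-null value per row, while a backward fill yields the first. A minimal sketch on a hypothetical frame (the data and the names `block`, `last_non_null` and `first_non_null` are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which row 0 has two non-null values.
df = pd.DataFrame({
    "A": ["p", "y", np.nan],
    "B": ["x", np.nan, np.nan],
    "Z": [np.nan, np.nan, "q"],
})

block = df.loc[:, "A":"Z"]
# Forward fill pushes values rightward, so the last column
# ends up holding the LAST non-null value of each row.
last_non_null = block.ffill(axis=1).iloc[:, -1]
# Backward fill pulls values leftward, so the first column
# ends up holding the FIRST non-null value of each row.
first_non_null = block.bfill(axis=1).iloc[:, 0]
print(last_non_null.tolist(), first_non_null.tolist())
```

For the data in the question, where each row has exactly one non-null value, both variants give the same column.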
