Fill missing columns in a Pandas DataFrame?
There are lots of questions about filling missing values.
I want to fill in whole missing columns.
Suppose I have:
import pandas as pd
df = pd.DataFrame([1, 2], columns=['A'])
   A
0  1
1  2
What's the idiomatic way to do something like this?
df.fillmissing(['A','B','C'])
My current code:
colnames = ['A', 'B', 'C']
for name in colnames:
    if name not in df:
        df[name] = None
This produces:
   A   B   C
0  1 NaN NaN
1  2 NaN NaN
Explanation of output:
In this case A is a no-op, but B and C get added.
Any suggestions?
Perhaps you need reindex:
df.reindex(['A', 'B', 'C'], axis=1)
   A   B   C
0  1 NaN NaN
1  2 NaN NaN
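This fills the missing columns with NaN and leaves the listed existing columns as they are. Note that reindex returns a new DataFrame rather than modifying df in place, so assign the result back if you want to keep it.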
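One thing worth checking before relying on reindex: it keeps only the labels you pass, so an existing column that is not in the list is dropped. The sketch below illustrates this with an extra column "Z" (a hypothetical addition, not part of the question's data) and shows two options that may help: taking the union of the existing and wanted labels, and passing fill_value to use something other than NaN.

```python
import pandas as pd

# "Z" is a made-up extra column, added only to show what reindex does
# with existing columns that are not in the requested list.
df = pd.DataFrame({"A": [1, 2], "Z": [9, 8]})
wanted = ["A", "B", "C"]

# Plain reindex keeps only the listed labels, so "Z" is dropped.
only_wanted = df.reindex(columns=wanted)

# Union the existing labels with the wanted ones to keep "Z" as well.
all_cols = df.columns.union(wanted, sort=False)
kept = df.reindex(columns=all_cols)

# fill_value replaces the default NaN in the newly created columns.
zero_filled = df.reindex(columns=wanted, fill_value=0)

print(only_wanted)
print(kept)
print(zero_filled)
```

Whether dropping unlisted columns is desirable depends on the use case; if the goal is only to add what is missing, the union variant is probably the safer of the two.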
You could also try transposing it and then reindexing it:
print(df.T.reindex(['A', 'B', 'C']).T)
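A caveat with the transpose route, sketched below using the question's data: reindexing the transposed frame introduces NaN into the same columns that hold the original values, so after transposing back the existing column A ends up as a float column. Reindexing along axis=1 directly leaves A as an integer column. The variable names here are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([1, 2], columns=["A"])
wanted = ["A", "B", "C"]

# Reindex the columns directly: "A" keeps its integer dtype.
direct = df.reindex(wanted, axis=1)

# Transpose, reindex the rows, transpose back: the NaN rows added for
# "B" and "C" force the values to float, so "A" becomes float too.
via_transpose = df.T.reindex(wanted).T

print(direct.dtypes)
print(via_transpose.dtypes)
```

For that reason the direct axis=1 form is likely the better default unless there is a specific reason to work on the transposed frame.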
Let's say you have the following dataframe df and lookup columns cols:
df=pd.DataFrame([1,2], columns=['A'])
cols = ['A', 'B', 'C']
From there, you can subtract the dataframe's columns from cols and create all the remaining columns (the ones that are missing from df) at once, setting them to None. Note: you cannot subtract lists from each other directly; convert them to sets first. Then wrap the resulting set in [*...] to unpack it into a list. Because a set is unordered, the new columns may not be added in the order given in cols.
Method 1: Set Subtraction
df[[*set(cols) - set(df.columns)]] = None
df
Out[1]:
   A     B     C
0  1  None  None
1  2  None  None
The list comprehension way would be:
Method 2: List Comprehension (similar to your for loop)
df[[col for col in cols if col not in df.columns]] = None
df
Out[1]:
   A     B     C
0  1  None  None
1  2  None  None
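If this comes up often, the same idea can be wrapped in a small function. The following is only a sketch: add_missing_columns is a hypothetical helper name, not a pandas method, and it defaults to np.nan rather than None on the assumption that a float NaN column is usually more convenient than an object column of None. It returns a copy instead of modifying the input, and it adds columns in the order they appear in cols.

```python
import numpy as np
import pandas as pd


def add_missing_columns(df, cols, fill_value=np.nan):
    """Return a copy of df in which every name in cols exists as a column.

    Hypothetical helper for illustration; existing columns are left untouched
    and missing ones are appended in the order given by cols.
    """
    out = df.copy()
    for col in cols:
        if col not in out.columns:
            out[col] = fill_value
    return out


df = pd.DataFrame([1, 2], columns=["A"])
result = add_missing_columns(df, ["A", "B", "C"])
print(result)
```

Passing fill_value=None should reproduce the None-filled output shown above, while the original df is left with only column A.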