How to check if a column exists in Pandas
How do I check if a column exists in a Pandas DataFrame df?
   A   B    C
0  3  40  100
1  6  30  200
How would I check whether the column "A" exists in the above DataFrame so that I can compute:
df['sum'] = df['A'] + df['C']
And if "A" doesn't exist:
df['sum'] = df['B'] + df['C']
This will work:
if 'A' in df:
But for clarity, I'd probably write it as:
if 'A' in df.columns:
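For completeness, here is a minimal sketch of how the membership test could drive the computation from the question. The helper name add_sum is made up for illustration, and the DataFrame is rebuilt from the sample values above.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"A": [3, 6], "B": [40, 30], "C": [100, 200]})


def add_sum(frame):
    """Hypothetical helper: add 'sum' using 'A' when present, else 'B'."""
    # `in frame.columns` (or just `in frame`) tests column labels, not cell values.
    first = "A" if "A" in frame.columns else "B"
    out = frame.copy()  # leave the caller's frame untouched
    out["sum"] = out[first] + out["C"]
    return out


with_a = add_sum(df)
without_a = add_sum(df.drop(columns="A"))

print(with_a["sum"].tolist())     # [103, 206]
print(without_a["sum"].tolist())  # [140, 230]
```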
To check whether one or more columns all exist, you can use set.issubset, as in:
if set(['A','C']).issubset(df.columns):
    df['sum'] = df['A'] + df['C']
As @brianpck points out in a comment, set([]) can alternatively be constructed with curly braces:
if {'A', 'C'}.issubset(df.columns):
See this question for a discussion of the curly-braces syntax.
Or you can use a generator expression, as in:
if all(item in df.columns for item in ['A','C']):
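As a quick sanity check, the set-based test and the all(...) test should agree. The sketch below wraps each in a small function; the function names and the label "Z" are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 6], "B": [40, 30], "C": [100, 200]})


def has_all_subset(frame, cols):
    # True only if every requested label is among the frame's columns.
    return set(cols).issubset(frame.columns)


def has_all_generator(frame, cols):
    # Same check, written with all() over a generator expression.
    return all(c in frame.columns for c in cols)


present = ["A", "C"]
partly_missing = ["A", "Z"]  # "Z" is a made-up label that df does not have

print(has_all_subset(df, present), has_all_generator(df, present))
# True True
print(has_all_subset(df, partly_missing), has_all_generator(df, partly_missing))
# False False
```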
Just to suggest another way without using if statements, you can use the get() method of DataFrames. To perform the sum from the question:
df['sum'] = df.get('A', df['B']) + df['C']
The DataFrame get method behaves like the get method of Python dictionaries: it returns the column if it exists and the default otherwise.
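A small sketch of get() with and without the column present, using the sample data from the question. One caveat worth keeping in mind: the default argument is an ordinary expression, so df['B'] is evaluated before get() is called and will itself raise a KeyError if "B" is missing.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 6], "B": [40, 30], "C": [100, 200]})
no_a = df.drop(columns="A")

# "A" exists, so the default is ignored.
total_with = (df.get("A", df["B"]) + df["C"]).tolist()

# "A" is missing, so get() falls back to the default, column "B".
total_without = (no_a.get("A", no_a["B"]) + no_a["C"]).tolist()

# With no default, a missing column gives None, as dict.get does.
missing = no_a.get("A")

print(total_with)     # [103, 206]
print(total_without)  # [140, 230]
print(missing)        # None
```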
You can use the set method issuperset:
set(df).issuperset(['A', 'B'])
# set(df.columns).issuperset(['A', 'B'])
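To illustrate, iterating over a DataFrame yields its column labels, so set(df) and set(df.columns) hold the same elements. The sketch below also uses a set difference to report which names are absent; "Z" is a made-up label.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 6], "B": [40, 30], "C": [100, 200]})

# Both spellings build the same set of column labels.
same_set = set(df) == set(df.columns)

has_a_b = set(df).issuperset(["A", "B"])
has_a_z = set(df).issuperset(["A", "Z"])  # "Z" does not exist in df

# Set difference shows which requested labels are missing.
absent = set(["A", "Z"]) - set(df)

print(same_set, has_a_b, has_a_z)  # True True False
print(absent)                      # {'Z'}
```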
You can also call isin() on the columns to check whether any of the listed names exists, and call any() on the result to reduce it to a single boolean value (see note 1):
if df.columns.isin(['A', 'C']).any():
    # do something
    pass
To check that a column name is not present, you can use the not operator in the if clause:
if 'A' not in df:
    # do something
    pass
or combine it with the isin().any() call, which is true only when none of the listed columns is present:
if not df.columns.isin(['A', 'C']).any():
    # do something
    pass
1: The isin() call on the columns returns a boolean array whose values are True where the column name is either A or C and False otherwise. The truth value of an array is ambiguous, so the any() call reduces it to a single True/False value.
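Here is a rough sketch of what the intermediate array looks like and how any() collapses it. Note that any() passes as soon as one name matches; if every name must be present, one option (an addition here, not part of the answer above) is to flip the test and call all() on the requested names. "X" and "Z" are made-up labels.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 6], "B": [40, 30], "C": [100, 200]})

# One boolean per column of df, in column order A, B, C.
mask = df.columns.isin(["A", "C"]).tolist()

any_present = bool(df.columns.isin(["A", "Z"]).any())   # "A" matches
none_present = not df.columns.isin(["X", "Z"]).any()    # nothing matches

# Flipped test: is every requested name among df's columns?
all_present = bool(pd.Index(["A", "C"]).isin(df.columns).all())
all_with_z = bool(pd.Index(["A", "Z"]).isin(df.columns).all())

print(mask)                       # [True, False, True]
print(any_present, none_present)  # True True
print(all_present, all_with_z)    # True False
```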