How to check column-wise whether a pandas DataFrame contains only numeric values?

I want to check, for every column in a DataFrame, whether it contains only numeric values. How can I do that?

You can check that by using to_numeric and coercing errors:

pd.to_numeric(df['column'], errors='coerce').notnull().all()

For all columns, you can iterate through the columns or just use apply:

df.apply(lambda s: pd.to_numeric(s, errors='coerce').notnull().all())

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col':  [1, 2, 10, np.nan, 'a'],
                   'col2': ['a', 10, 30, 40, 50],
                   'col3': [1, 2, 3, 4, 5.0]})

Output:

col     False
col2    False
col3     True
dtype: bool
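
Note that the check above also reports False for a column whose only problem is a missing value, because NaN fails notnull(). The following is a minimal sketch of a variant that can tolerate pre-existing missing values; the helper name is_numeric_column, its allow_missing flag, and the column names are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "d" is numeric apart from one missing value.
df = pd.DataFrame({
    "a": [1, 2, 10, np.nan, "a"],
    "b": ["a", 10, 30, 40, 50],
    "c": [1, 2, 3, 4, 5.0],
    "d": [1, 2, np.nan, 4, 5],
})

def is_numeric_column(s, allow_missing=False):
    """Return True if every value in the Series converts to a number."""
    converted = pd.to_numeric(s, errors="coerce")
    if allow_missing:
        # A value is acceptable if it converted, or if it was already missing.
        return bool((converted.notna() | s.isna()).all())
    return bool(converted.notna().all())

strict = df.apply(is_numeric_column)                       # NaN counts as non-numeric
lenient = df.apply(is_numeric_column, allow_missing=True)  # NaN is tolerated

print(strict)
print(lenient)
```

Keep in mind that to_numeric also accepts strings such as "10", so this tests whether values are convertible to numbers, not whether the column already has a numeric dtype.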

You can get a True/False result for each value using str.isnumeric().

Example:

 >>> df
       A      B
0      1      1
1    NaN      6
2    NaN    NaN
3      2      2
4    NaN    NaN
5      4      4
6   some   some
7  value  other

Results:

>>> df.A.str.isnumeric()
0     True
1      NaN
2      NaN
3     True
4      NaN
5     True
6    False
7    False
Name: A, dtype: object

# df.B.str.isnumeric()

With the apply() method, which seems more robust if you need to check every cell of the DataFrame:

A DataFrame with two different columns, one of mixed type and one with numbers only, for testing:

>>> df
       A   B
0      1   1
1    NaN   6
2    NaN  33
3      2   2
4    NaN  22
5      4   4
6   some  66
7  value  11

Result:

>>> df.apply(lambda x: x.str.isnumeric())
       A     B
0   True  True
1    NaN  True
2    NaN  True
3   True  True
4    NaN  True
5   True  True
6  False  True
7  False  True
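
Two caveats are worth checking before relying on str.isnumeric(). It only tests whether each string consists of numeric characters, so signed or decimal strings are reported as non-numeric, and the .str accessor is only available on string-like columns. The sketch below illustrates both points with made-up values; it is not taken from the original answer.

```python
import pandas as pd

# Strings only: a sign or a decimal point makes isnumeric() return False.
s = pd.Series(["1", "-22", "1.5", "some"])
result = s.str.isnumeric()
print(result)

# On a column that already has a numeric dtype, .str is not available.
try:
    pd.Series([1, 2, 3]).str.isnumeric()
    raised = False
except AttributeError:
    raised = True
print(raised)
```

If the data may contain negative numbers or floats, the to_numeric approach is likely the safer choice.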

Another example:

Let's consider the DataFrame below, which has columns of different data types.

>>> df
   num  rating    name  age
0    0    80.0  shakir   33
1    1   -22.0   rafiq   37
2    2   -10.0     dev   36
3  num     1.0   suraj   30

This follows the OP's comment on this answer about data that contains negative values and zeros.

1- This is a pseudo-internal method that returns only the numeric-type data.

>>> df._get_numeric_data()
   rating  age
0    80.0   33
1   -22.0   37
2   -10.0   36
3     1.0   30

OR

2- You can use the select_dtypes method (defined in pandas.core.frame), which returns a subset of the DataFrame's columns based on the column dtypes. It takes include and exclude parameters.

>>> df.select_dtypes(include=['int64','float64']) # choosing int & float
   rating  age
0    80.0   33
1   -22.0   37
2   -10.0   36
3     1.0   30

>>> df.select_dtypes(include=['int64'])  # choose int
   age
0   33
1   37
2   36
3   30
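
Listing exact dtype names such as 'int64' can miss numeric columns stored with another width. As a rough sketch, the DataFrame above is rebuilt here with the age column deliberately stored as int32 (an assumption made only to show the difference), and selected with the generic "number" selector:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "num": [0, 1, 2, "num"],                          # mixed values, not numeric
    "rating": [80.0, -22.0, -10.0, 1.0],
    "name": ["shakir", "rafiq", "dev", "suraj"],
    "age": np.array([33, 37, 36, 30], dtype="int32"), # assumed int32 for illustration
})

by_name = df.select_dtypes(include=["int64", "float64"])  # exact dtypes only
by_kind = df.select_dtypes(include="number")              # any numeric dtype

print(by_name.columns.tolist())
print(by_kind.columns.tolist())
```

Here the exact-name selection drops age, while include="number" (equivalent to np.number) keeps it.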

This will return True if all columns are numeric, False otherwise.

df.shape[1] == df.select_dtypes(include=np.number).shape[1]

To select numeric columns:

new_df = df.select_dtypes(include=np.number)

Let's say you have a dataframe called df. If you do:

df.select_dtypes(include=["float", 'int'])

this will return all the numeric columns; you can then check whether the result is the same as the original df.

Alternatively, you can use the exclude parameter:

df.select_dtypes(exclude=["float", 'int'])

and check whether this gives you an empty dataframe.
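
A minimal sketch of that exclude-based check, wrapped in a helper. The function name all_columns_numeric and the sample frames are illustrative assumptions. It counts the remaining columns rather than using .empty, since .empty is also True for a frame that simply has no rows.

```python
import pandas as pd

def all_columns_numeric(frame):
    """True when excluding numeric dtypes leaves no columns behind."""
    return frame.select_dtypes(exclude="number").shape[1] == 0

numeric_only = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
mixed = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

print(all_columns_numeric(numeric_only))
print(all_columns_numeric(mixed))
```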

The accepted answers seem a bit overkill, as they sub-select the entire dataframe.

To check types, only metadata should be used, which can be done with pd.api.types.is_numeric_dtype.

import pandas as pd
df = pd.DataFrame(data=[[1, 'a']], columns=['numeric_col', 'string_col'])

print(df.columns[list(map(pd.api.types.is_numeric_dtype, df.dtypes))])  # one way: names of the numeric columns
print(df.dtypes.map(pd.api.types.is_numeric_dtype))  # another way: True/False per column
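
One difference to be aware of when choosing between the dtype-based approaches: is_numeric_dtype treats boolean columns as numeric, whereas select_dtypes(include="number") does not. The sketch below uses a made-up three-column frame to show this, along with a one-line check across all columns.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical frame with an integer, a boolean, and a string column.
df = pd.DataFrame({"n": [1, 2], "flag": [True, False], "s": ["a", "b"]})

numeric_mask = df.dtypes.map(is_numeric_dtype)  # inspects dtypes only, not the data
selected = df.select_dtypes(include="number").columns.tolist()
all_numeric = bool(numeric_mask.all())          # single answer for the whole frame

print(numeric_mask)
print(selected)
print(all_numeric)
```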
