Select rows from Pandas dataframe where a specific column contains numbers
I have a data frame where one column (column B) can contain letters, a number, or nothing at all. Let's say the data frame is:
A   B    C
1   2    Dog
3   C    Bird
30  nan  Cat
11  4.1  Wolf
I want to select rows conditionally, based on whether there is a number in column B:
A   B    C
1   2    Dog
11  4.1  Wolf
I have found that I can limit the dataframe to only rows that contain values by entering df.loc[df["B"].notnull()]. What I'm trying to find out is whether there is an equivalent of .notnull() that selects only rows where column B contains a number.
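For reference, here is a minimal sketch that rebuilds the example and shows what .notnull() keeps. It assumes column B is stored with object dtype, holding the numbers as Python numbers, the letter as a string, and the empty cell as NaN; the real data may be stored differently (for example, entirely as strings).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (assumed layout, see the note above).
df = pd.DataFrame({
    "A": [1, 3, 30, 11],
    "B": [2, "C", np.nan, 4.1],
    "C": ["Dog", "Bird", "Cat", "Wolf"],
})

# notnull() only drops the missing value; the row holding the letter "C" is kept.
non_null = df.loc[df["B"].notnull()]
print(non_null)
```

The Bird row survives this filter, which is why .notnull() alone is not enough here.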
To find integers and decimal numbers, define a function that takes a value as input and attempts to convert it to a floating point number, which succeeds for an integer or a floating point number (or a string holding one). The function also handles the possible errors: a ValueError is raised if you pass float() a string that can't be converted to a floating point number, and a TypeError is raised if a null value such as None is passed to float(), so handle these two exceptions:
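import math

def safe_float_convert(x):
    try:
        value = float(x)
    except ValueError:
        return False  # not numeric
    except TypeError:
        return False  # null type such as None
    # float() accepts NaN without raising, so reject it explicitly
    return not math.isnan(value)  # numeric, success!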
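To make the exception behaviour concrete, here is a small sketch of what float() does with each kind of value that might appear in column B. The helper name float_outcome is hypothetical and exists only for this illustration.

```python
def float_outcome(x):
    """Hypothetical helper: report what float(x) does for a single value."""
    try:
        return repr(float(x))
    except (ValueError, TypeError) as exc:
        # Return the name of the exception that float() raised.
        return type(exc).__name__

for value in [2, "4.1", "C", None, float("nan")]:
    print(repr(value), "->", float_outcome(value))
```

Note that NaN converts without raising either exception, so a missing value stored as NaN is not caught by the two except clauses alone and needs an explicit check.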
Now use map() to apply the new function to column B of the dataframe, element-wise, and create a boolean mask:
mask = df['B'].map(safe_float_convert)
Then use the .loc[] indexer, passing in the boolean mask:
numeric_df = df.loc[mask]
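As an alternative sketch (a different technique from the approach above), pandas' pd.to_numeric with errors="coerce" can build a comparable mask without a custom function. The data frame below is an assumed reconstruction of the example from the question, with the empty cell stored as NaN.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame from the question.
df = pd.DataFrame({
    "A": [1, 3, 30, 11],
    "B": [2, "C", np.nan, 4.1],
    "C": ["Dog", "Bird", "Cat", "Wolf"],
})

# errors="coerce" turns anything that cannot be parsed as a number into NaN,
# so notna() is True only where column B held a number.
mask = pd.to_numeric(df["B"], errors="coerce").notna()
numeric_df = df.loc[mask]
print(numeric_df)
```

This treats both the letter and the missing value as non-numeric in one step. Whether it suits the real data depends on how column B is actually stored.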