
Error: The truth value of a Series is ambiguous - Python pandas

I know this question has been asked before; however, I get an error when I try to use an if statement. I looked at this link, but it did not help much in my case. My dfs is a list of DataFrames.

I am trying the following:

for i in dfs:
    if (i['var1'] < 3.000):
        print(i)

This gives the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I also tried the following and got the same error:

for i, j in enumerate(dfs):
    if (j['var1'] < 3.000):
        print(i)

My var1 data type is float32. I am not using any other logical operators such as & or |. In the above link, the error seemed to be caused by using logical operators. Why do I get a ValueError?

Here is a small demo that shows why this is happening:

In [131]: df = pd.DataFrame(np.random.randint(0,20,(5,2)), columns=list('AB'))

In [132]: df
Out[132]:
    A   B
0   3  11
1   0  16
2  16   1
3   2  11
4  18  15

In [133]: res = df['A'] > 10

In [134]: res
Out[134]:
0    False
1    False
2     True
3    False
4     True
Name: A, dtype: bool

When we try to check whether such a Series is True, pandas doesn't know what to do:

In [135]: if res:
     ...:     print(df)
     ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
skipped
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
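The same failure can be reproduced outside IPython. The sketch below uses a small hand-written DataFrame (the values are illustrative, not the random ones from the demo) and catches the exception to show that it is the `if` itself, not the comparison, that raises:

```python
import pandas as pd

# Illustrative data; any numeric column behaves the same way.
df = pd.DataFrame({"A": [3, 0, 16, 2, 18]})

# The comparison is element-wise, so the result is a boolean Series, not a bool.
res = df["A"] > 10

try:
    if res:  # `if` asks for bool(res), which pandas refuses to guess
        print(df)
except ValueError as exc:
    message = str(exc)
    print(message)
```

The comparison succeeds and yields one True/False per row; the error only appears once Python needs a single truth value for the whole Series.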

Workarounds:

We can decide how to treat a Series of boolean values. For example, the if should evaluate to True only if all values are True:

In [136]: res.all()
Out[136]: False

or when at least one value is True:

In [137]: res.any()
Out[137]: True

In [138]: if res.any():
     ...:     print(df)
     ...:
    A   B
0   3  11
1   0  16
2  16   1
3   2  11
4  18  15
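Applied to the loop from the question, the choice between all() and any() decides which DataFrames are reported. A minimal sketch, assuming `dfs` is a list of DataFrames with a float32 `var1` column (the sample data here is hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the question's `dfs`.
dfs = [
    pd.DataFrame({"var1": [1.0, 2.5]}, dtype="float32"),
    pd.DataFrame({"var1": [2.0, 4.0]}, dtype="float32"),
    pd.DataFrame({"var1": [5.0, 6.0]}, dtype="float32"),
]

# Positions of DataFrames where every var1 value is below 3.0.
all_below = [i for i, df in enumerate(dfs) if (df["var1"] < 3.0).all()]

# Positions of DataFrames where at least one var1 value is below 3.0.
any_below = [i for i, df in enumerate(dfs) if (df["var1"] < 3.0).any()]

print(all_below)
print(any_below)
```

Which of the two is right depends on what the original `if` was meant to express; pandas raises the error precisely because it cannot decide that for you.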

Currently, you're selecting the entire Series for comparison. To get an individual value from the Series (here, the first one), you'll want to use something along the lines of:

for i in dfs:
    if i['var1'].iloc[0] < 3.000:
        print(i)

To compare each of the individual elements, you can use Series.items (named Series.iteritems in older pandas and removed in pandas 2.0; documentation is sparse on this one), like so:

for i in dfs:
    for _, v in i['var1'].items():
        if v < 3.000:
            print(v)

In most cases, the better solution is to select a subset of the DataFrame to use for whatever you need, like so:

for i in dfs:
    subset = i[i['var1'] < 3.000]
    # do something with the subset

On large DataFrames, pandas is much faster when using Series operations instead of iterating over individual values. For more detail, you can check out the pandas documentation on selection.
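As a rough illustration of the subset approach, the sketch below builds a boolean mask once and uses it to select rows, then checks that an element-by-element loop picks out the same values. The column names and data are made up for the example:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"var1": [1.5, 3.2, 0.7, 4.1], "var2": list("abcd")})

# One vectorized comparison produces a boolean mask for all rows at once.
mask = df["var1"] < 3.0

# Indexing with the mask keeps only the rows where it is True.
subset = df[mask]

# The equivalent element-by-element loop, shown only for comparison.
looped = [v for _, v in df["var1"].items() if v < 3.0]

print(subset)
```

The mask version never needs a single truth value for the whole Series, so the ambiguity error does not arise.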

The comparison returns a Series of boolean values, so you need to reduce it to a single value with any() or all(), for example (inside a function):

if (df[col] == ' this is any string or list').any():
    return df.loc[df[col] == temp].index.values.astype(int)[0]
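A self-contained variant of that idea might look like the sketch below. The function name `first_match_index` and its parameters are hypothetical, and it assumes the value being tested and the value being looked up are the same:

```python
import pandas as pd

def first_match_index(df, col, value):
    """Return the index label of the first row where df[col] == value, else None."""
    mask = df[col] == value       # boolean Series, one entry per row
    if mask.any():                # reduce to a single bool before using `if`
        return df.loc[mask].index[0]
    return None

# Made-up data for illustration.
people = pd.DataFrame({"name": ["a", "b", "c", "b"]})
found = first_match_index(people, "name", "b")
missing = first_match_index(people, "name", "z")
print(found, missing)
```

Returning None when nothing matches avoids the IndexError that taking element 0 of an empty index would otherwise raise.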
