
pandas dataframe - return the value in iloc, or return zero if it does not exist

When using the iloc method on a Pandas dataframe, I want to get zero back if the value does not exist. (I have a query that always returns either one row or an empty dataframe, and I want the first, leftmost value when it exists.)

import pandas as pd

mydict = {"col1":[1,2], "price":[1000,2000]}
df = pd.DataFrame(mydict)
query=df[df['price']>3000]

try:
    print(query.iloc[0][0])
except BaseException:
    print(0)

#print result: 0

Is there a better way, or a built-in method, for iloc? I am thinking of something similar to the get method of Python dictionaries!

You can be more pythonic by replacing your try/except block with:

print(0 if len(query)==0 else query.iloc[0][0])

Explanation: len() applied to a pandas DataFrame returns the number of rows.

Update: as suggested in the comments, query.empty is more idiomatic and .iat is better for scalar lookups, hence:

print(0 if query.empty else query.iat[0,0])
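If this lookup is needed in several places, the check could be wrapped in a small function that reads a bit like dict.get. This is only a sketch: the name iat_or_default and its parameters are hypothetical (pandas has no such built-in), and it handles only non-negative positions.

```python
import pandas as pd


def iat_or_default(frame, row=0, col=0, default=0):
    """Return frame.iat[row, col], or `default` if that position does not exist.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    n_rows, n_cols = frame.shape
    # Only non-negative positions are considered in this sketch.
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return frame.iat[row, col]
    return default


df = pd.DataFrame({"col1": [1, 2], "price": [1000, 2000]})
empty_value = iat_or_default(df[df["price"] > 3000])  # no matching rows
found_value = iat_or_default(df[df["price"] > 1500])  # one matching row
print(empty_value, found_value)
```

Checking the shape rather than only .empty also covers a frame that has rows but no columns.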

There's no intrinsically better way than try / except. The rationale for iloc is indexing by integer position.

The behaviour and functionality are consistent with NumPy np.ndarray, Python list and other indexable objects. There is no direct way to index the first value of a list and get 0 back if the list is empty.
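To illustrate that consistency, the short check below indexes position 0 of an empty list, an empty NumPy array and an empty pandas Series (through iloc) and records whether each raises IndexError. The variable names are made up for this sketch.

```python
import numpy as np
import pandas as pd

# Three empty indexable objects; each is indexed with [0] below.
empty_containers = {
    "list": [],
    "ndarray": np.array([]),
    "Series.iloc": pd.Series([], dtype="int64").iloc,
}

raised = {}
for name, container in empty_containers.items():
    try:
        container[0]
    except IndexError:
        raised[name] = True
    else:
        raised[name] = False

print(raised)
```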

A slightly better way is to be explicit: catch only IndexError, and use iat to access scalars by integer location. Moreover, you can index by row and column simultaneously:

try:
    print(query.iat[0, 0])
except IndexError:
    print(0)

You can probably use something like

next(iter(series), default)

For example, using your input:

In [1]: 
import pandas as pd
mydict = {"col1":[1,2], "price":[1000,2000]}
df = pd.DataFrame(mydict)
df
Out[1]: 
   col1  price
0     1   1000
1     2   2000

Filtering on price > 2000 gives the default value (set to zero here), since df.loc[mask] is empty:

In [2]: 
mask = (df['price']>2000)
next(iter(df.loc[mask]['col1']), 0)
Out[2]: 
0

The other cases work as expected. For example, filtering on price > 1500 gives 2:

In [3]: 
mask = (df['price']>1500)
next(iter(df.loc[mask]['col1']), 0)
Out[3]: 
2

Filtering on price > 500 gives 1:

In [4]: 
mask = (df['price']>500)
next(iter(df.loc[mask]['col1']), 0)
Out[4]: 
1
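The three cases above can also be run in one go. The sketch below wraps the next/iter idiom in a function; the name first_or_default is hypothetical, and df.loc[mask, 'col1'] is used in place of the chained df.loc[mask]['col1'] so that rows and column are selected in a single step.

```python
import pandas as pd


def first_or_default(values, default=0):
    """Return the first item of an iterable, or `default` if it is empty.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    return next(iter(values), default)


df = pd.DataFrame({"col1": [1, 2], "price": [1000, 2000]})

# Map each price threshold to the first matching col1 value (0 if none match).
results = {
    threshold: first_or_default(df.loc[df["price"] > threshold, "col1"])
    for threshold in (2000, 1500, 500)
}
print(results)
```

Because next(iter(...), default) relies only on iteration, the same function works for a list, a NumPy array or a Series.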
