Check if a value exists in pandas dataframe index

I am sure there is an obvious way to do this, but I can't think of anything slick right now.

Basically, instead of raising an exception, I would like to get True or False to see if a value exists in the pandas df index.

import pandas as pd
df = pd.DataFrame({'test':[1,2,3,4]}, index=['a','b','c','d'])
df.loc['g']  # raises KeyError; I would like to get False instead

What I have working now is the following:

sum(df.index == 'g')

This should do the trick:

'g' in df.index
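As a minimal sketch of how this membership test can stand in for the exception-raising lookup, assuming the question's example frame. The helper name `has_label` is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'test': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

def has_label(frame, label):
    """Return True if `label` is in the frame's index (hypothetical helper)."""
    return label in frame.index

found_a = has_label(df, 'a')
found_g = has_label(df, 'g')

# Guard the lookup instead of catching KeyError.
row = df.loc['a'] if found_a else None
```

Here `found_a` is True and `found_g` is False, so `df.loc` is only called for a label that exists.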

Just for reference, as it was something I was looking for: you can test for presence within the values or the index by appending the ".values" attribute, e.g.

'g' in df.<your selected field>.values
'g' in df.index.values

I find that adding ".values" to get a plain ndarray out makes existence or "in" checks run more smoothly with other Python tools. Just thought I'd toss that out there for people.

A MultiIndex works a little differently from a single index. Here are some methods for a multi-indexed dataframe.

df = pd.DataFrame({'col1': ['a', 'b','c', 'd'], 'col2': ['X','X','Y', 'Y'], 'col3': [1, 2, 3, 4]}, columns=['col1', 'col2', 'col3'])
df = df.set_index(['col1', 'col2'])

"in df.index" works for the first level only when checking a single index value.

'a' in df.index     # True
'X' in df.index     # False

Check df.index.levels for other levels.

'a' in df.index.levels[0] # True
'X' in df.index.levels[1] # True

Check in df.index for an index combination tuple.

('a', 'X') in df.index  # True
('a', 'Y') in df.index  # False
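One caveat with `levels`, sketched here under the assumption that the frame has been filtered after the index was set: `levels` can keep labels that no remaining row uses, while `get_level_values` reflects only the labels actually present. The variable names below are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': ['a', 'b', 'c', 'd'],
     'col2': ['X', 'X', 'Y', 'Y'],
     'col3': [1, 2, 3, 4]}
).set_index(['col1', 'col2'])

# Labels actually present in the second level of the index.
x_present = 'X' in df.index.get_level_values('col2')

# After filtering, `levels` can still list labels that no row uses.
subset = df[df['col3'] <= 2]   # keeps only the 'X' rows
y_in_levels = 'Y' in subset.index.levels[1]
y_in_values = 'Y' in subset.index.get_level_values('col2')
```

In this sketch `y_in_levels` is True even though no row of `subset` carries 'Y', whereas `y_in_values` is False.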

With the DataFrame df_data:

>>> df_data
  id   name  value
0  a  ampha      1
1  b   beta      2
2  c     ce      3

I tried:

>>> getattr(df_data, 'value').isin([1]).any()
True
>>> getattr(df_data, 'value').isin(['1']).any()
True

but:

>>> 1 in getattr(df_data, 'value')
True
>>> '1' in getattr(df_data, 'value')
False

So fun :D
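The difference comes from what `in` tests on a Series: it checks the index labels, not the stored values. A small sketch, rebuilding `df_data` from the printout above:

```python
import pandas as pd

df_data = pd.DataFrame({'id': ['a', 'b', 'c'],
                        'name': ['ampha', 'beta', 'ce'],
                        'value': [1, 2, 3]})
values = df_data['value']

label_hit = 1 in values          # True: 1 is a row label (0, 1, 2)
label_miss = 3 in values         # False: 3 is a value, but not a row label
value_hit = 3 in values.values   # True: the underlying array holds 3
str_miss = '1' in values         # False: no row label equals the string '1'
```

So `1 in getattr(df_data, 'value')` succeeds only because 1 happens to be a row label; use `.values` or `.isin(...)` to test the stored values.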

The code below does not print a boolean, but allows for dataframe subsetting by index. I understand this is likely not the most efficient way to solve the problem, but (1) I like the way this reads and (2) you can easily subset where the df1 index exists in df2:

df3 = df1[df1.index.isin(df2.index)]

or where the df1 index does not exist in df2:

df3 = df1[~df1.index.isin(df2.index)]
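A runnable sketch of both subsets. The frames `df1` and `df2` and their contents are made up for illustration; only their indexes matter.

```python
import pandas as pd

# Hypothetical frames; only their indexes matter here.
df1 = pd.DataFrame({'x': [10, 20, 30]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'y': [1, 2]}, index=['b', 'd'])

mask = df1.index.isin(df2.index)   # boolean array, one entry per row of df1
shared = df1[mask]                 # rows of df1 whose label is in df2
only_df1 = df1[~mask]              # rows of df1 whose label is not in df2
```

With these inputs `shared` holds the row labelled 'b' and `only_df1` holds the rows labelled 'a' and 'c'.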
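
Another self-contained check:

import pandas as pd

df = pd.DataFrame({'g': [1]}, index=['isStop'])

# df.loc['g'] would raise KeyError: 'g' is a column name, not an index label

if 'g' in df.index:
    print("find g")        # not printed

if 'isStop' in df.index:
    print("find isStop")   # printed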
