Pandas dataframe slice by index

I am trying to slice a dataframe by its index, but it fails with this error: TypeError: 'Int64Index([1], dtype='int64')' is an invalid key

import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
index = df.index[df['Name'] == 'Bob']
print(index)
df = df.loc[index:]  # this line raises the TypeError shown below

Error:

df = df.loc[index:]
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1867, in _getitem_axis
return self._get_slice_axis(key, axis=axis)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1533, in _get_slice_axis
slice_obj.step, kind=self.name)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4672, in slice_indexer
kind=kind)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4871, in slice_locs
start_slice = self.get_slice_bound(start, 'left', kind)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4801, in get_slice_bound
slc = self._get_loc_only_exact_matches(label)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4771, in _get_loc_only_exact_matches
return self.get_loc(key)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 2656, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 110, in pandas._libs.index.IndexEngine.get_loc
TypeError: 'Int64Index([1], dtype='int64')' is an invalid key

Printing the index gives 'Int64Index([1], dtype='int64')'. How can I convert it to an int value?

Not much documentation is available at https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Int64Index.html

To do this, you need to make sure that your index variable contains just an integer, rather than some other object which may contain multiple values (as it would if 'Bob' appeared more than once). In this case it only contains one value, since 'Bob' appears only once in your table, but what you get is an Int64Index object, which is capable of holding several integers. What you want is just a plain old integer.
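To make the difference concrete, here is a minimal sketch using the table from the question. The names matches and first_label are only illustrative, and the exact class name that gets printed depends on the pandas version, so no output is shown for it.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Boolean indexing on df.index returns an Index object (one entry per match),
# even when there is only a single match.
matches = df.index[df['Name'] == 'Bob']
print(type(matches).__name__, len(matches))

# Taking one element out of it gives a scalar label that .loc can slice from.
first_label = int(matches[0])
print(first_label)
```

The scalar first_label is what a slice such as df.loc[first_label:] expects as its start.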

The following should work for your table, and also for a table where 'Bob' appears multiple times (it selects the index of the first row in which 'Bob' appears):

index = (df['Name'] == 'Bob').idxmax()

The idxmax function returns the index label of the highest-valued item in a Series. True is higher than False, so it returns the index where Name is 'Bob'. If two or more items share the highest value, the first index is returned.
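As a rough illustration of that behaviour, the sketch below uses a made-up table in which 'Bob' appears twice. It also shows one caveat worth guarding against: when nothing matches, every value is False, so idxmax still returns the first label. The extra row and the name 'Zed' are assumptions added for the example.

```python
import pandas as pd

# Hypothetical table with a duplicate 'Bob' (not the one from the question).
df = pd.DataFrame({'Name': ['Alex', 'Bob', 'Clarke', 'Bob'],
                   'Age': [10, 12, 13, 40]})

mask = df['Name'] == 'Bob'   # False, True, False, True
index = mask.idxmax()        # label of the first True
print(index)                 # 1

# Caveat: with no match at all, idxmax falls back to the first label,
# so check mask.any() before trusting the result.
no_match = df['Name'] == 'Zed'
print(no_match.any(), no_match.idxmax())   # False 0
```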

Try this if you want the whole dataframe starting from this index:

df = df.loc[index[0]:]
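Put together with the code from the question, that might look like the sketch below. The check for an empty result and the name tail are additions for the example, since index[0] raises an IndexError when no row matches.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

index = df.index[df['Name'] == 'Bob']

if len(index) > 0:
    # .loc slices by label, from the first matching row to the end.
    tail = df.loc[index[0]:]
else:
    # No match: fall back to an empty frame with the same columns.
    tail = df.iloc[0:0]

print(tail)
```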

If you only want the row(s) with that name, try:

df = df[df['Name'] == 'Bob']

A slight modification to your code:

index = list(df.index[df['Name'] == 'Bob'])

This should give you the position as a plain list ([1] for your table). Let me know if it works.
