I am trying to slice a DataFrame by index, but it raises the error 'TypeError: 'Int64Index([1], dtype='int64')' is an invalid key'.
import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
index = df.index[df['Name'] == 'Bob']
print(index)
df = df.loc[index:]
Error:
df = df.loc[index:]
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1867, in _getitem_axis
return self._get_slice_axis(key, axis=axis)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1533, in _get_slice_axis
slice_obj.step, kind=self.name)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4672, in slice_indexer
kind=kind)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4871, in slice_locs
start_slice = self.get_slice_bound(start, 'left', kind)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4801, in get_slice_bound
slc = self._get_loc_only_exact_matches(label)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4771, in _get_loc_only_exact_matches
return self.get_loc(key)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 2656, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 110, in pandas._libs.index.IndexEngine.get_loc
TypeError: 'Int64Index([1], dtype='int64')' is an invalid key
Printing the index gives 'Int64Index([1], dtype='int64')'. How can I convert it to an int value?
Not much documentation is available at https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Int64Index.html
To do this, you need to make sure that your index variable contains just an integer, rather than some other object that may contain multiple values (for example, if 'Bob' appears more than once). In this case it would contain only one value, since 'Bob' appears only once in your table, but what you get is an Int64Index object, which is capable of holding several integers. What you want is just a plain old integer.
The following should work for your table, and for a table where Bob does indeed appear multiple times (it will select the index for the first row in which 'Bob' appears):
index = (df['Name'] == 'Bob').idxmax()
The idxmax function returns the index of the highest-valued item in a Series (and True is higher than False, so it returns the index where Name is 'Bob'). If two or more items share the highest value, the first such index is returned.
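As a rough illustration of the idxmax approach, here is a minimal sketch using the table from the question with a second 'Bob' row added purely for demonstration. The names mask, first_bob, tail and has_zed are made up for this example. One thing to watch: if the name never occurs, the boolean Series is all False and idxmax still returns the first label, so checking any() first is probably a good idea.

```python
import pandas as pd

# Table from the question, plus an extra 'Bob' row added for illustration.
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13], ['Bob', 40]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

mask = df['Name'] == 'Bob'   # boolean Series: False, True, False, True
first_bob = mask.idxmax()    # label of the first True value
tail = df.loc[first_bob:]    # label-based slice from that row to the end

# Caveat: for a name that never occurs the mask is all False, and idxmax()
# would still return the first label, so test with any() beforehand.
has_zed = (df['Name'] == 'Zed').any()
```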
Try this if you want to get the whole DataFrame starting from this index:
df = df.loc[index[0]:]
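Taking the first element of the index object gives a single scalar label, which .loc accepts as a slice bound. The sketch below shows this on the table from the question. The names matches, start and result are hypothetical, and the empty-match branch is an added assumption about how you might want to handle a missing name.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

matches = df.index[df['Name'] == 'Bob']  # an index object holding the label 1
if len(matches) > 0:
    start = matches[0]                   # a single scalar label
    result = df.loc[start:]              # rows from 'Bob' to the end
else:
    result = df.iloc[0:0]                # empty frame when nothing matches
```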
If you are trying to get only the row by name try:
df = df[df['Name'] == 'Bob']
A slight modification to your code:
index = list(df.index[df['Name'] == 'Bob'])
This should give you the matching positions as a plain Python list. Let me know if it works.