Checking if a pandas dataframe is indexed?
Is it possible to check whether a pandas dataframe is indexed, that is, whether DataFrame.set_index(...) was ever called on it? I could check whether df.index is a numeric list, but that is not a perfect test.
One way would be to compare it to the plain default index:
pd.Index(np.arange(0, len(df))).equals(df.index)
For example:
In [11]: df = pd.DataFrame([['a', 'b'], ['c', 'd']], columns=['A', 'B'])
In [12]: df
Out[12]:
   A  B
0  a  b
1  c  d
In [13]: pd.Index(np.arange(0, len(df))).equals(df.index)
Out[13]: True
And if it's not the plain index, it returns False:
In [14]: df = df.set_index('A')
In [15]: pd.Index(np.arange(0, len(df))).equals(df.index)
Out[15]: False
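The comparison above can be wrapped in a small helper. This is only a sketch: the name has_default_index is made up for illustration, and the last case shows why this is not a perfect test, since Index.equals compares values and not names.

```python
import numpy as np
import pandas as pd


def has_default_index(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if df.index holds 0..len(df)-1 in order."""
    return bool(pd.Index(np.arange(len(df))).equals(df.index))


df = pd.DataFrame([["a", "b"], ["c", "d"]], columns=["A", "B"])

# Freshly built frame: the index is the plain 0..n-1 range.
plain = has_default_index(df)

# After set_index the index holds the values of column A instead.
after_set = has_default_index(df.set_index("A"))

# Caveat: a column that happens to hold 0..n-1 looks like the default
# index once it is set, because only the values are compared.
lookalike_df = pd.DataFrame({"k": [0, 1], "v": ["x", "y"]}).set_index("k")
lookalike = has_default_index(lookalike_df)
```

So the check tells you whether the index looks like the default one, not whether set_index was ever called.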
I just ran into this myself. The problem is that a dataframe already has an index before .set_index() is called, so the question is really whether or not the index is named. In that case, df.index.name appears to be less reliable than df.index.names: for a MultiIndex, df.index.name is None even when every level has a name.
>>> import pandas as pd
>>> df = pd.DataFrame({"id1": [1, 2, 3], "id2": [4,5,6], "word": ["cat", "mouse", "game"]})
>>> df
   id1  id2   word
0    1    4    cat
1    2    5  mouse
2    3    6   game
>>> df.index
RangeIndex(start=0, stop=3, step=1)
>>> df.index.name, df.index.names[0]
(None, None)
>>> "indexed" if df.index.names[0] else "no index"
'no index'
>>> df1 = df.set_index("id1")
>>> df1
     id2   word
id1
1      4    cat
2      5  mouse
3      6   game
>>> df1.index
Int64Index([1, 2, 3], dtype='int64', name='id1')
>>> df1.index.name, df1.index.names[0]
('id1', 'id1')
>>> "indexed" if df1.index.names[0] else "no index"
'indexed'
>>> df12 = df.set_index(["id1", "id2"])
>>> df12
          word
id1 id2
1   4      cat
2   5    mouse
3   6     game
>>> df12.index
MultiIndex([(1, 4),
            (2, 5),
            (3, 6)],
           names=['id1', 'id2'])
>>> df12.index.name, df12.index.names[0]
(None, 'id1')
>>> "indexed" if df12.index.names[0] else "no index"
'indexed'
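If you want this as a reusable check, something like the following might work. It is a sketch, and index_is_named is a name I made up. It tests every level against None instead of relying on the truthiness of names[0], on the assumption that a falsy name such as 0 or an empty string should still count as a name.

```python
import pandas as pd


def index_is_named(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if any level of df.index has a name."""
    return any(name is not None for name in df.index.names)


df = pd.DataFrame(
    {"id1": [1, 2, 3], "id2": [4, 5, 6], "word": ["cat", "mouse", "game"]}
)

default_named = index_is_named(df)                          # RangeIndex, no name
single_named = index_is_named(df.set_index("id1"))          # one named level
multi_named = index_is_named(df.set_index(["id1", "id2"]))  # two named levels
```

Note that this only detects a named index. An index whose name was later cleared, or one built from unnamed data, would still be reported as not named.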
The following worked for me: I call set_index([label], append=False) if the dataframe has the default RangeIndex, and set_index([label], append=True) otherwise.
append = not isinstance(df.index, pd.RangeIndex)
df.set_index([label], drop=True, append=append, inplace=True)
My assumption is that when the index is the default RangeIndex and I set another column as the index, I can drop the RangeIndex.
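Here is roughly the same idea written as a function that returns a new dataframe instead of modifying it in place. This is a sketch under the same assumption, and add_index_level is a hypothetical name.

```python
import pandas as pd


def add_index_level(df: pd.DataFrame, label: str) -> pd.DataFrame:
    """Hypothetical helper: set `label` as the index, or append it as a new level.

    Assumes a RangeIndex carries no information and can be replaced.
    """
    append = not isinstance(df.index, pd.RangeIndex)
    return df.set_index([label], drop=True, append=append)


df = pd.DataFrame(
    {"id1": [1, 2, 3], "id2": [4, 5, 6], "word": ["cat", "mouse", "game"]}
)

step1 = add_index_level(df, "id1")     # RangeIndex is replaced by id1
step2 = add_index_level(step1, "id2")  # id2 is appended as a second level
```

Keep in mind that isinstance(df.index, pd.RangeIndex) is also true for a range that does not start at 0, for example after slicing the frame, so this treats any RangeIndex as disposable.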