Assuming that we have a data frame with an index that might have a name:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})
df = df.set_index(['a'])
   b  c
a      
1  3  2
2  6  6
3  1  0
What is the best way to get the column names, including the index name if it is present?
Calling df.columns.tolist() does not include the index name and returns ['b', 'c'] in this case, whereas I would like to obtain ['a', 'b', 'c'].
The index can be temporarily reset for the call:
df.reset_index().columns.tolist()
By default, an unnamed index becomes a column called 'index' after reset_index(). To keep that out of the list, reset the index only when it has a name:
(df.reset_index() if df.index.name else df).columns.tolist()
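As a rough sketch, the conditional expression can be wrapped in a small helper and checked against both cases. The function name columns_with_index is hypothetical and only used for illustration here:

```python
import pandas as pd

def columns_with_index(df):
    """Return the column names, preceded by the index name if the index has one."""
    # reset_index() moves a named index into the columns. It is skipped for an
    # unnamed index, which would otherwise appear as a column called 'index'.
    return (df.reset_index() if df.index.name else df).columns.tolist()

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})

unnamed = columns_with_index(df)                 # default index, no name
named = columns_with_index(df.set_index('a'))    # index named 'a'

print(unnamed)
print(named)
```

Both calls should print the same three names, because the unnamed index is left alone and the named index is moved back into the columns.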
For a universal solution, filter out None, which is the value of index.name when the index has no name:
df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})
print([df.index.name] + df.columns.tolist())
[None, 'a', 'b', 'c']
c = list(filter(None, [df.index.name] + df.columns.tolist()))
print(c)
['a', 'b', 'c']
df = df.set_index(['a'])
c = list(filter(None, [df.index.name] + df.columns.tolist()))
print(c)
['a', 'b', 'c']
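Another solution uses numpy.insert and difference. Note that Index.difference sorts its result by default, so the original column order is not guaranteed: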
c = np.insert(df.columns, 0, df.index.name).difference([None]).tolist()
print (c)
['a', 'b', 'c']
You can use list with filter after moving the index into the columns via reset_index:
res = list(filter(None, df.reset_index()))
print(res)
['a', 'b', 'c']
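One caveat with this approach, sketched here under the assumption of a default unnamed index: reset_index() labels the new column 'index', and since that label is truthy, filter(None, ...) does not remove it.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})

# Iterating over a DataFrame yields its column labels.
named = list(filter(None, df.set_index('a').reset_index()))

# With the default unnamed index, reset_index() adds a column labelled 'index'.
# That label is a non-empty string, so filter(None, ...) keeps it.
unnamed = list(filter(None, df.reset_index()))

print(named)
print(unnamed)
```

So this variant gives the desired list when the index is named, but includes an extra 'index' entry when it is not.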
I think with more recent versions of pandas this answer might be more concise:
names = list(filter(None, df.index.names + df.columns.values.tolist()))
This works for an unnamed index, a single-level Index, or a MultiIndex. It also avoids calling reset_index(), which incurs an unnecessary performance hit for such a simple operation.
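A minimal sketch of the three cases, using the same sample data as the question. The helper name all_names is hypothetical, and index.names is converted to a plain list before concatenation as a precaution:

```python
import pandas as pd

def all_names(frame):
    # index.names has one entry per index level; unnamed levels are None,
    # and filter(None, ...) drops them.
    return list(filter(None, list(frame.index.names) + frame.columns.tolist()))

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})

plain = all_names(df)                          # unnamed default index
single = all_names(df.set_index('a'))          # single named index
multi = all_names(df.set_index(['a', 'b']))    # two-level MultiIndex

print(plain)
print(single)
print(multi)
```

Keep in mind that filter(None, ...) drops every falsy label, not only None, so a column labelled 0 or '' would be removed as well.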
I think you got this frame from a group by operation. If so, you need to add reset_index() at the end to get the column names in the regular way.
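For illustration, assuming the frame comes from a groupby on column 'a' followed by sum() (the sample data below is made up for this sketch), the grouping key ends up in the index and can be brought back either with reset_index() or by passing as_index=False:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': [3, 6, 1], 'c': [2, 6, 0]})

# After a groupby aggregation the grouping key becomes the index.
grouped = df.groupby('a').sum()
before = grouped.columns.tolist()

# reset_index() turns the key back into a regular column.
after = grouped.reset_index().columns.tolist()

# as_index=False keeps the key as a column from the start.
flat = df.groupby('a', as_index=False).sum().columns.tolist()

print(before)
print(after)
print(flat)
```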