Get column names including index name from pandas data frame

Assume that we have a data frame with an index that might have a name:

import pandas as pd

df = pd.DataFrame({'a':[1,2,3],'b':[3,6,1], 'c':[2,6,0]})
df = df.set_index(['a'])

   b  c
a      
1  3  2
2  6  6
3  1  0

What is the best way to get the column names so that the list also includes the index name, if one is present?

Calling df.columns.tolist() does not include the index name and returns ['b', 'c'] in this case, whereas I would like to obtain ['a', 'b', 'c'].

The index can be temporarily reset for the call:

df.reset_index().columns.tolist()

If an unnamed index should not contribute an entry to the list, call reset_index() only when the index has a name:

(df.reset_index() if df.index.name else df).columns.tolist()
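
As a minimal sketch of why the conditional form matters: reset_index() turns an unnamed index into a column called 'index', which would otherwise show up in the result. The frame names `named` and `unnamed` are made up for this illustration.

```python
import pandas as pd

# Frame whose index has a name ('a').
named = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1]}).set_index('a')
# Frame with the default RangeIndex, which has no name.
unnamed = pd.DataFrame({'b': [3, 6, 1], 'c': [2, 6, 0]})

# reset_index() returns a new frame; the original is left untouched.
named_cols = named.reset_index().columns.tolist()

# An unnamed index is materialised as a column called 'index'.
unnamed_cols = unnamed.reset_index().columns.tolist()

# The conditional form skips the reset when there is no index name.
conditional_cols = (unnamed.reset_index() if unnamed.index.name else unnamed).columns.tolist()

print(named_cols)        # ['a', 'b']
print(unnamed_cols)      # ['index', 'b', 'c']
print(conditional_cols)  # ['b', 'c']
```

Note that df.index.name is None for a MultiIndex, so the conditional form would skip the reset there too.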

For a universal solution, filter out None, which is what index.name returns when the index has no name:

df = pd.DataFrame({'a':[1,2,3],'b':[3,6,1], 'c':[2,6,0]})

print ([df.index.name] + df.columns.tolist())
[None, 'a', 'b', 'c']

c = list(filter(None, [df.index.name] + df.columns.tolist()))
print (c)
['a', 'b', 'c']

df = df.set_index(['a'])

c = list(filter(None, [df.index.name] + df.columns.tolist()))
print (c)
['a', 'b', 'c']
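
One caveat with filter(None, ...), shown as a small sketch: it drops every falsy value, not only None, so a column labelled 0 or '' would disappear as well. The frame `df_int` below is a made-up example with integer column labels; an explicit `is not None` check may be safer in that situation.

```python
import pandas as pd

# Hypothetical frame with integer column labels and an unnamed index.
df_int = pd.DataFrame({0: [1, 2], 1: [3, 4]})
labels = [df_int.index.name] + df_int.columns.tolist()

# filter(None, ...) removes None but also the falsy label 0.
dropped_falsy = list(filter(None, labels))

# Testing for None explicitly keeps the label 0.
only_none_removed = [x for x in labels if x is not None]

print(labels)             # [None, 0, 1]
print(dropped_falsy)      # [1]
print(only_none_removed)  # [0, 1]
```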

Another solution, using numpy.insert and difference:

c = np.insert(df.columns, 0, df.index.name).difference([None]).tolist()
print (c)

['a', 'b', 'c']

You can also use list with filter after promoting the index to a column via reset_index:

res = list(filter(None, df.reset_index()))

print(res)

['a', 'b', 'c']

I think that with more recent versions of pandas this answer might be more concise:

names = list(filter(None, df.index.names + df.columns.values.tolist()))

This works with no named index, a single-column Index, or a MultiIndex. It avoids calling reset_index(), which incurs an unnecessary performance cost for such a simple operation.
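
The three cases can be checked with a short sketch. The helper name `all_names` is hypothetical, and the explicit list() around df.index.names is an added precaution so the concatenation does not depend on the container type pandas uses for the names.

```python
import pandas as pd

def all_names(df):
    """Return the named index levels followed by the column names."""
    # df.index.names has one entry per index level; unnamed levels are None.
    return list(filter(None, list(df.index.names) + df.columns.tolist()))

base = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 6, 1], 'c': [2, 6, 0]})

no_index = all_names(base)                    # default, unnamed index
single = all_names(base.set_index('a'))       # single named index
multi = all_names(base.set_index(['a', 'b'])) # two-level MultiIndex

print(no_index)  # ['a', 'b', 'c']
print(single)    # ['a', 'b', 'c']
print(multi)     # ['a', 'b', 'c']
```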

I guess you got this frame from a group by operation. If so, you need to add reset_index() at the end so that the column names can be obtained in the usual way.
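
A minimal sketch of the group by case, assuming a made-up frame `sales` grouped on column 'a': the grouping key becomes the index of the result, and either reset_index() or as_index=False brings it back as a regular column.

```python
import pandas as pd

# Hypothetical input frame; 'a' is used as the grouping key.
sales = pd.DataFrame({'a': [1, 1, 2], 'b': [3, 6, 1], 'c': [2, 6, 0]})

# After groupby(...).sum() the key 'a' is the index, not a column.
grouped = sales.groupby('a').sum()
grouped_cols = grouped.columns.tolist()

# reset_index() moves 'a' back into the columns.
restored_cols = grouped.reset_index().columns.tolist()

# as_index=False keeps 'a' as a column from the start.
flat_cols = sales.groupby('a', as_index=False).sum().columns.tolist()

print(grouped_cols)   # ['b', 'c']
print(restored_cols)  # ['a', 'b', 'c']
print(flat_cols)      # ['a', 'b', 'c']
```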
