
In a multi-indexed dataframe, .columns.levels[1] after groupby gives the columns of the whole dataframe

Let's say I have three dataframes that I concatenate horizontally with the help of a MultiIndex:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('ABCDE'))
df2 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('AGHIJ'))
df3 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('ALMNP'))
dfs = [df1, df2, df3]
# The keys 0, 1, 2 become the outer level of the column MultiIndex
result = pd.concat(dfs, axis=1, keys=range(len(dfs)))

If I group by the first column level and take group 0, I should get my first dataframe back, and its list of columns should be A, B, C, D, E. That is not the case:

print(result.groupby(axis=1, level=0).get_group(0).columns.levels[1])

This gives me all the columns of df1, df2 and df3.

I would use get_level_values, since levels keeps every label from the original dataframe's columns:

result.groupby(axis=1, level=0).get_group(0).columns.get_level_values(1)
Out[1296]: Index(['A', 'B', 'C', 'D', 'E'], dtype='object')

This is an issue with unused levels. When you subset a MultiIndex, the original levels are still there, just unused, so you can remove them if needed:

result.groupby(axis=1, level=0).get_group(0).columns.remove_unused_levels().levels[1]
#Index(['A', 'B', 'C', 'D', 'E'], dtype='object')

To see that everything is still there, look at the columns. The second level (levels[1]) still holds 13 values, but this group only references the first 5:

print(result.groupby(axis=1, level=0).get_group(0).columns)
#MultiIndex(levels=[[0, 1, 2], ['A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'J', 'L', 'M', 'N', 'P']],
#           codes=[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4]])
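
The three approaches above can be compared side by side in a minimal sketch. Two things here are assumptions rather than part of the original posts: the frames are filled with placeholder zeros (only the column labels matter), and the block under key 0 is selected with a boolean mask on the outer level instead of groupby(axis=1), since recent pandas releases deprecate grouping along axis=1. The names first, all_labels, used_labels and pruned_labels are illustrative.

```python
import pandas as pd

# Three small frames with partly overlapping column names, as in the question.
# The values are placeholders; only the column labels matter here.
frames = [
    pd.DataFrame(0, index=range(4), columns=list("ABCDE")),
    pd.DataFrame(0, index=range(4), columns=list("AGHIJ")),
    pd.DataFrame(0, index=range(4), columns=list("ALMNP")),
]
result = pd.concat(frames, axis=1, keys=range(len(frames)))

# Select the block under key 0 with a boolean mask on the outer level
# (used here in place of groupby(axis=1).get_group(0)).
first = result.loc[:, result.columns.get_level_values(0) == 0]

# .levels stores every label the original index knew about.
all_labels = list(first.columns.levels[1])
# get_level_values returns only the labels actually present in this subset.
used_labels = list(first.columns.get_level_values(1))
# remove_unused_levels rebuilds the index so .levels matches what is used.
pruned_labels = list(first.columns.remove_unused_levels().levels[1])

print(len(all_labels))
print(used_labels)
print(pruned_labels)
```

get_level_values is the simpler choice when you only need to read the labels. remove_unused_levels is more useful when the pruned index will be kept and code further along relies on .levels.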

