In a multi-indexed dataframe, .columns.levels[1] after groupby gives the columns of the whole dataframe
Let's say I have three data frames that I concatenate horizontally, using a MultiIndex on the columns:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('ABCDE'))
df2 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('AGHIJ'))
df3 = pd.DataFrame(data=np.random.randint(0, 100, (4, 5)), columns=list('ALMNP'))
dfs = [df1, df2, df3]
# keys 0, 1, 2 become the outer level of the column MultiIndex
result = pd.concat(dfs, axis=1, keys=range(len(dfs)))
If I group by the first level of the column index, I should get my first data frame back, and its list of columns should be ABCDE, but that is not the case.
print(result.groupby(axis=1, level=0).get_group(0).columns.levels[1])
This gives me all the columns of df1, df2 and df3.
I would use get_level_values, since levels keeps every category from the original dataframe's columns:
result.groupby(axis=1, level=0).get_group(0).columns.get_level_values(1)
Out[1296]: Index(['A', 'B', 'C', 'D', 'E'], dtype='object')
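The difference between the two attributes can be seen on a much smaller index. The following is a minimal sketch, assuming a hypothetical four-entry MultiIndex built by hand (the names mi, sub, all_inner and used_inner are illustrative and not from the question):

```python
import pandas as pd

# Hypothetical toy index: two outer keys, each with its own inner labels.
mi = pd.MultiIndex.from_tuples([(0, "A"), (0, "B"), (1, "G"), (1, "H")])

# Slicing keeps the original levels and only narrows the codes.
sub = mi[:2]

all_inner = list(sub.levels[1])             # every label the parent index knew about
used_inner = list(sub.get_level_values(1))  # only the labels present in the slice

print(all_inner)
print(used_inner)
```

Here levels still reports the labels of the parent index, while get_level_values reports one label per remaining entry.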
This is an issue with unused levels. When you take a subset of a MultiIndex, the levels are still there, just unused, so you can remove them if needed:
result.groupby(axis=1, level=0).get_group(0).columns.remove_unused_levels().levels[1]
#Index(['A', 'B', 'C', 'D', 'E'], dtype='object')
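Newer pandas releases (2.1 and later) deprecate DataFrame.groupby(axis=1), so the same check may need a different selection step there. The sketch below assumes that selecting with a list of outer keys, result[[0]], is an acceptable substitute for get_group(0); the names rng, frames, first and trimmed are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frames = [
    pd.DataFrame(rng.integers(0, 100, (4, 5)), columns=list(cols))
    for cols in ("ABCDE", "AGHIJ", "ALMNP")
]
result = pd.concat(frames, axis=1, keys=range(len(frames)))

# Select every column under outer key 0 while keeping both column levels.
first = result[[0]]

# Drop the level entries that no remaining column refers to.
trimmed = first.columns.remove_unused_levels()

print(len(first.columns.levels[1]))   # size of level 1 before trimming
print(list(trimmed.levels[1]))        # level 1 after trimming
print(trimmed.equals(first.columns))  # whether the column labels are unchanged
```

remove_unused_levels returns a new index and leaves first.columns untouched, so assign the result back to first.columns if the trimmed levels should stick.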
To see that everything is still there, look at the columns. The second level (level 1) still holds 13 values, but this group only references the first 5 of them.
print(result.groupby(axis=1, level=0).get_group(0).columns)
#MultiIndex(levels=[[0, 1, 2], ['A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'J', 'L', 'M', 'N', 'P']],
# codes=[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4]])
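How codes point into levels can be sketched by rebuilding the index printed above by hand. This is an illustration only; cols and decoded are hypothetical names, and the levels and codes are copied from that printed output.

```python
import pandas as pd

# Rebuild the index from its levels and codes.
cols = pd.MultiIndex(
    levels=[[0, 1, 2], list("ABCDEGHIJLMNP")],
    codes=[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4]],
)

# Each code is a position into the level at the same depth.
decoded = cols.levels[1].take(cols.codes[1])

print(list(decoded))
print(list(cols.get_level_values(1)))
```

Decoding the codes by hand yields the same labels as get_level_values(1), which is why that method ignores the eight unused entries in level 1.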