
Iteratively create a multi-index, multi-column dataframe in pandas

Let's say that I want to create a multi-index and multi-column dataframe:

                            X         Y
Planet Continent Country    A    B    C    D
Earth  Europe    England  0.3  0.5  0.6  0.8
       Europe    Italy    0.1  0.2  0.4  1.2
Mars   Tempe     Sirtys   3.2  4.5  2.3  4.2

I want to do that by iteratively collecting each single row of the dataframe:

import numpy as np

row1 = np.array(['Earth', 'Europe', 'England', 0.3, 0.5, 0.6, 0.8])
row2 = np.array(['Earth', 'Europe', 'Italy', 0.1, 0.2, 0.4, 1.2])

I know how, starting from rows, to create a multi-column dataframe, and I know how to create a multi-index one. But how can I create both? Thanks

If you start from an empty dataframe defined with a MultiIndex for both the index and the columns (which, as you say, you already know how to build):

import pandas as pd

# Empty frame whose index and columns are both MultiIndex objects
df = pd.DataFrame(index=pd.MultiIndex(levels=[[]]*3,
                                      codes=[[]]*3,
                                      names=['Planet','Continent','Country']),
                  columns=pd.MultiIndex.from_tuples([('X','A'), ('X','B'),
                                                     ('Y','C'), ('Y','D')]))

Then you can add each row like this:

df.loc[tuple(row1[:3]), :] = row1[3:]
print(df)
                            X         Y     
                            A    B    C    D
Planet Continent Country                    
Earth  Europe    England  0.3  0.5  0.6  0.8

and likewise for the next row:

df.loc[tuple(row2[:3]), :] = row2[3:]
print(df)
                            X         Y     
                            A    B    C    D
Planet Continent Country                    
Earth  Europe    England  0.3  0.5  0.6  0.8
                 Italy    0.1  0.2  0.4  1.2
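
One caveat with the rows as defined in the question: `np.array` given a mix of strings and numbers produces a string array, so the cells assigned above hold text such as '0.3' rather than floats. A minimal sketch of splitting a row into an index key and numeric values before assigning (the names `key` and `values` are illustrative, not part of the original answer):

```python
import numpy as np

row1 = np.array(['Earth', 'Europe', 'England', 0.3, 0.5, 0.6, 0.8])

# Mixed strings and numbers are coerced to a single string dtype.
print(row1.dtype.kind)  # 'U' (unicode string)

# The first three entries form the index key (as plain Python strings);
# the remaining entries are converted back to floats.
key = tuple(str(label) for label in row1[:3])
values = row1[3:].astype(float)
```

Using `df.loc[key, :] = values` would then store numbers instead of strings. The columns of a frame that was created empty may still report `object` dtype, in which case calling `df.astype(float)` after filling is one option.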

But if you have many rows available at once, @Yo_Chris's answer will be much easier.

import numpy as np
import pandas as pd

row1 = np.array(['Earth', 'Europe', 'England', 0.3, 0.5, 0.6, 0.8])
row2 = np.array(['Earth', 'Europe', 'Italy', 0.1, 0.2, 0.4, 1.2])
# create a data frame and use the first three columns as the index
df = pd.DataFrame([row1, row2]).set_index([0, 1, 2])
# set the index names
df.index.names = ['Planet', 'Continent', 'Country']
# create a multi-index and assign it to the columns
df.columns = pd.MultiIndex.from_tuples([('X', 'A'), ('X', 'B'), ('Y', 'C'), ('Y', 'D')])

                            X         Y     
                            A    B    C    D
Planet Continent Country                    
Earth  Europe    England  0.3  0.5  0.6  0.8
                 Italy    0.1  0.2  0.4  1.2
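
If rows arrive one at a time but the frame is only needed at the end, another option is to accumulate plain tuples in a list and build the frame once. This is a sketch under that assumption, not the approach of either answer; `collected`, `index_names` and `column_tuples` are names made up here. Keeping the numbers in tuples, rather than in a mixed `np.array`, preserves them as floats.

```python
import pandas as pd

index_names = ['Planet', 'Continent', 'Country']
column_tuples = [('X', 'A'), ('X', 'B'), ('Y', 'C'), ('Y', 'D')]

collected = []
for row in [
    ('Earth', 'Europe', 'England', 0.3, 0.5, 0.6, 0.8),
    ('Earth', 'Europe', 'Italy', 0.1, 0.2, 0.4, 1.2),
    ('Mars', 'Tempe', 'Sirtys', 3.2, 4.5, 2.3, 4.2),
]:
    # Stand-in for rows that are produced one by one elsewhere.
    collected.append(row)

# Build the frame once: first three fields become the row MultiIndex,
# the remaining four become the values under the column MultiIndex.
df = pd.DataFrame(
    [row[3:] for row in collected],
    index=pd.MultiIndex.from_tuples([row[:3] for row in collected],
                                    names=index_names),
    columns=pd.MultiIndex.from_tuples(column_tuples),
)
print(df)
```

Building once at the end avoids growing the frame row by row, and the value columns come out as `float64` without a separate conversion step.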
