
Creating a MultiIndex DataFrame with a list of tuples of lists

I want to create a multi-index DataFrame with the following:

import pandas as pd

column_1=['A','B','C','D']
column_2=[['a','b'],'c',['d','e'],['f','i','j']]
value_1=[1,2,3,4]
value_2=[5,6,7,8]
df=pd.DataFrame({"Column_1":column_1,
             "Column_2":column_2,
             "Value_1":value_1,
             "Value_2":value_2},
            index=pd.MultiIndex.from_arrays(list(zip(*[column_1,column_2]))),
            columns=["Value_1","Value_2"])
df

Then I got the error:

ValueError: setting an array element with a sequence

I searched a bit, and I think the reason is that pandas does not understand that column_1 does not have the same length as column_2. How can I fix this? I want something like:

    Value_1   Value_2
A a
  b
B c

.....

Well, I am not sure I have completely understood what you want, but what about something like this?

df = pd.concat([pd.Series(row['Column_1'], list(row['Column_2']))
                    for i, row in df.iterrows()]).reset_index()
df.columns = ['Column_2', 'Column_1']

df.set_index(['Column_1', 'Column_2'], inplace=True)
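
As an alternative, here is a minimal sketch that also keeps the value columns. It assumes the goal is one row per pair of Column_1 and a single element of Column_2, with Value_1 and Value_2 repeated for each element, and it assumes pandas 0.25 or later so that DataFrame.explode is available. The names flat and result are only illustrative and do not come from the question.

```python
import pandas as pd

column_1 = ['A', 'B', 'C', 'D']
column_2 = [['a', 'b'], 'c', ['d', 'e'], ['f', 'i', 'j']]
value_1 = [1, 2, 3, 4]
value_2 = [5, 6, 7, 8]

# Build an ordinary flat frame first; Column_2 holds lists and plain strings.
flat = pd.DataFrame({"Column_1": column_1,
                     "Column_2": column_2,
                     "Value_1": value_1,
                     "Value_2": value_2})

# explode() turns each list element into its own row and leaves
# scalar entries such as 'c' unchanged; the other columns are repeated.
result = flat.explode("Column_2").set_index(["Column_1", "Column_2"])

print(result)
```

If the assumption about repeating the values holds, result has eight rows, indexed by (A, a), (A, b), (B, c) and so on.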

