I want to create a multi-index DataFrame with the following:
column_1=['A','B','C','D']
column_2=[['a','b'],'c',['d','e'],['f','i','j']]
value_1=[1,2,3,4]
value_2=[5,6,7,8]
df=pd.DataFrame({"Column_1":column_1,
                 "Column_2":column_2,
                 "Value_1":value_1,
                 "Value_2":value_2},
                index=pd.MultiIndex.from_arrays(list(zip(*[column_1,column_2]))),
                columns=["Value_1","Value_2"])
df
Then I got the error:
ValueError: setting an array element with a sequence
I searched a bit, and I think the reason is that pandas cannot line up column_1 with column_2, because some entries of column_2 are lists of different lengths. But how can I fix this? I want something like:
     Value_1 Value_2
A a
  b
B c
.....
Well, I am not sure I have completely understood you, but what about something like this?
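# Assumes df was built without the index/columns arguments, so that
# Column_1 and Column_2 are ordinary columns.
# Build one Series per row: values are Column_1, index is each item of Column_2.
df = pd.concat([pd.Series(row['Column_1'], list(row['Column_2']))
                for i, row in df.iterrows()]).reset_index()
df.columns = ['Column_2', 'Column_1']
df.set_index(['Column_1', 'Column_2'], inplace=True)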
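If Value_1 and Value_2 should be kept as well, one possible alternative is to build a flat frame first and then expand Column_2 with DataFrame.explode. This is only a sketch: it assumes pandas 0.25 or newer (where explode is available), and the names flat and result are illustrative, not from the original post.

```python
import pandas as pd

column_1 = ['A', 'B', 'C', 'D']
column_2 = [['a', 'b'], 'c', ['d', 'e'], ['f', 'i', 'j']]
value_1 = [1, 2, 3, 4]
value_2 = [5, 6, 7, 8]

# Build a flat frame first; Column_2 simply holds lists and strings as objects.
flat = pd.DataFrame({
    "Column_1": column_1,
    "Column_2": column_2,
    "Value_1": value_1,
    "Value_2": value_2,
})

# explode() gives each list element its own row and repeats the other columns;
# scalar entries such as 'c' stay as a single row.
result = flat.explode("Column_2").set_index(["Column_1", "Column_2"])
print(result)
```

Here each value is repeated for every element of its list, so ('A', 'a') and ('A', 'b') both carry Value_1 = 1 and Value_2 = 5. Whether that repetition is what you want depends on what the values mean.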