
Dataframe split row data into columns

I have a dataframe like this:

NAME VAL
A    1;2;3;4;5
B    10;20;30;40;50
C    11;22;33;44;55
D    a;b;c;d;e
E    0.0;0.1;0.2;0.3;0.4

I need to convert it into a dataframe like the one below:

A   B   C   D   E
1   10  11  a   0.0
2   20  22  b   0.1
3   30  33  c   0.2
4   40  44  d   0.3
5   50  55  e   0.4

I wrote the code below to get the required output:

import pandas as pd

df_x = pd.DataFrame([['A', '1;2;3;4;5'],
                     ['B', '10;20;30;40;50'],
                     ['C', '11;22;33;44;55'],
                     ['D', 'a;b;c;d;e'],
                     ['E', '0.0;0.1;0.2;0.3;0.4']], columns=['NAME', 'VAL'])
print(df_x, '\n')

# Map each NAME to the list of values split out of its VAL string
new_dict = {}

for _, row in df_x.iterrows():
    new_dict[row['NAME']] = row['VAL'].split(';')

df_y = pd.DataFrame(new_dict)
print(df_y)

However, if the VAL column holds thousands of values, I suspect this is not a very efficient way to get the output. Is there a more efficient way to do this (for example, without a separate dictionary, by doing it within the dataframe itself, or some other way)?

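Use DataFrame.set_index with Series.str.split, then transpose with DataFrame.T. Here set_index('NAME') turns the names into row labels, str.split(';', expand=True) spreads each split value into its own column, rename_axis(None) drops the leftover index name, and .T swaps rows and columns: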

df = df_x.set_index('NAME')['VAL'].str.split(';', expand=True).rename_axis(None).T
print(df, '\n')
   A   B   C  D    E
0  1  10  11  a  0.0
1  2  20  22  b  0.1
2  3  30  33  c  0.2
3  4  40  44  d  0.3
4  5  50  55  e  0.4 
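
Note that after str.split every cell is still a string, including the ones that look like numbers. If numeric columns are needed, one possible follow-up step is sketched below. The helper name to_numeric_if_possible and the result name df_typed are made up for this illustration and are not part of the answer above; the sketch assumes a column should be converted only when all of its values parse as numbers.

```python
import pandas as pd

df_x = pd.DataFrame([['A', '1;2;3;4;5'],
                     ['B', '10;20;30;40;50'],
                     ['C', '11;22;33;44;55'],
                     ['D', 'a;b;c;d;e'],
                     ['E', '0.0;0.1;0.2;0.3;0.4']], columns=['NAME', 'VAL'])

# Same chain as in the answer: the result holds strings only.
df = df_x.set_index('NAME')['VAL'].str.split(';', expand=True).rename_axis(None).T


def to_numeric_if_possible(col):
    """Hypothetical helper: convert a column to numbers, or return it unchanged
    if any of its values cannot be parsed (assumption: all-or-nothing per column)."""
    try:
        return pd.to_numeric(col)
    except (ValueError, TypeError):
        return col


# apply() calls the helper once per column
df_typed = df.apply(to_numeric_if_possible)
print(df_typed.dtypes)
```

With the sample data, columns A, B and C should come out as integers and E as floats, while D stays as strings because 'a' cannot be parsed.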
