How to split the column of a dataframe

I would like to split the column of a dataframe as follows. Here is the main dataframe.

import pandas as pd

df_az = pd.DataFrame(list(zip(storage_AZ)),columns =['AZ Combination'])
df_az            

[Image: AZ Combination]

Then, I applied this code to split the column.

out_az = (df_az.stack().apply(pd.Series).rename(columns=lambda x: f'a combination').unstack().swaplevel(0,1,axis=1).sort_index(axis=1))
out_az = pd.concat([out_az], axis=1)
out_az.head()

However, the result is as follows.

[Image: out_az]

Meanwhile, the expected result is:

[Image: expected result]

Could anyone tell me what to change in the code, please? Thank you in advance.

Here is an example of how you could use the str.split() method to split the "AZ Combination" column of your DataFrame:

df_az["AZ Combination"].str.split("-", expand=True)

This will split the "AZ Combination" column on the "-" character and create a new column for each part of the split. The expand=True argument tells the method to return a DataFrame with one column per part, instead of a Series of lists.

You can also assign the new columns to a new DataFrame:

df_split = df_az["AZ Combination"].str.split("-", expand=True)

You can also assign new column names to the new columns like this:

df_split.columns = ['Part1', 'Part2']

You could also merge the new dataframe with the original one like this:

df_az = pd.concat([df_az,df_split], axis=1)

Note that this splits the column on "-"; if your separator is different, adjust the code accordingly. Also, if the number of parts varies between rows, the split produces a different number of columns, so the list of column names has to be adjusted to match.
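
The steps in this answer can be put together as one small runnable sketch. It assumes every cell is a string containing a "-" separator; the sample values and the name df_demo are made up for illustration and are not from the question. If the cells hold nested lists instead (as in the storage_AZ sample given in another answer on this page), str.split would not apply as-is.

```python
import pandas as pd

# Hypothetical data: each cell is a string with a "-" separator.
df_demo = pd.DataFrame({"AZ Combination": ["a1-z1", "a2-z2", "a3-z3"]})

# n=1 limits the split to one cut, so each row yields at most two parts.
df_split = df_demo["AZ Combination"].str.split("-", n=1, expand=True)
df_split.columns = ["Part1", "Part2"]

# Put the new columns next to the original one.
df_joined = pd.concat([df_demo, df_split], axis=1)
print(df_joined)
```

Passing n=1 is a choice made here so that the two column names always fit; drop it if every part should get its own column.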

You can apply np.ravel (with numpy imported as np):

>>> pd.DataFrame.from_records(df_az['AZ Combination'].apply(np.ravel))

   0  1  2  3  4  5
0  0  0  0  0  0  0
1  0  0  0  0  0  1
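
For reference, here is a self-contained variant of the same idea. It is only a sketch: it assumes each cell is a nested list such as [[0,0,0],[0,0,1]], and it flattens the cells with a list comprehension before building the frame rather than going through from_records. The name flat is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample input assumed for illustration: two rows of nested lists.
storage_AZ = [[[0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 1]]]
df_az = pd.DataFrame(list(zip(storage_AZ)), columns=['AZ Combination'])

# np.ravel flattens each nested cell into a 1-D array of six values;
# a list of such arrays becomes one DataFrame row per cell.
flat = pd.DataFrame([np.ravel(cell) for cell in df_az['AZ Combination']])
print(flat)
```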

Convert the column to a list and reshape it into a 2D array, so that it can be passed to the DataFrame constructor.

Then set the column names; a counter is appended to each name to avoid duplicated column names:

import numpy as np
import pandas as pd

storage_AZ = [[[0,0,0],[0,0,0]],
              [[0,0,0],[0,0,1]],
              [[0,0,0],[0,1,0]],
              [[0,0,0],[1,0,0]],
              [[0,0,0],[1,0,1]]]
df_az = pd.DataFrame(list(zip(storage_AZ)),columns =['AZ Combination'])

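N = 3  # number of values in each inner list
L = ['a combination','z combination']
# Flatten each nested cell into one row of 2 * N values
df = pd.DataFrame(np.array(df_az['AZ Combination'].tolist()).reshape(df_az.shape[0],-1))
# Name each column by its group (position // N) plus a counter within the group (position % N)
df.columns = [f'{L[a]}_{b}' for a, b in zip(df.columns // N, df.columns % N)]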
print(df)
   a combination_0  a combination_1  a combination_2  z combination_0  \
0                0                0                0                0   
1                0                0                0                0   
2                0                0                0                0   
3                0                0                0                1   
4                0                0                0                1   

   z combination_1  z combination_2  
0                0                0  
1                0                1  
2                1                0  
3                0                0  
4                0                1  

If you need a MultiIndex:

df = pd.concat({'AZ Combination':df}, axis=1)
print(df)
   AZ Combination                                                  \
  a combination_0 a combination_1 a combination_2 z combination_0   
0               0               0               0               0   
1               0               0               0               0   
2               0               0               0               0   
3               0               0               0               1   
4               0               0               0               1   

                                   
  z combination_1 z combination_2  
0               0               0  
1               0               1  
2               1               0  
3               0               0  
4               0               1  
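
Another possible layout, sketched here as an assumption because the expected-result image is not available, puts 'a combination' and 'z combination' on the top column level and a counter on the second level. The names values, groups and df_multi are made up for illustration.

```python
import numpy as np
import pandas as pd

storage_AZ = [[[0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 1]],
              [[0, 0, 0], [0, 1, 0]],
              [[0, 0, 0], [1, 0, 0]],
              [[0, 0, 0], [1, 0, 1]]]
df_az = pd.DataFrame(list(zip(storage_AZ)), columns=['AZ Combination'])

# One row of six values per original cell.
values = np.array(df_az['AZ Combination'].tolist()).reshape(len(df_az), -1)

N = 3  # values per inner list
groups = ['a combination', 'z combination']

# from_product pairs each group name with the counters 0..N-1, in the
# same order as the flattened values: (a, 0), (a, 1), (a, 2), (z, 0), ...
columns = pd.MultiIndex.from_product([groups, range(N)])
df_multi = pd.DataFrame(values, columns=columns)

# Selecting a top-level name returns that group's N columns.
print(df_multi['z combination'])
```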
