How to make each row in dataframe have one value for each column?

I have the following dataframe which has the columns ID_x and ID_y that contain data separated with a single space:

df = pd.DataFrame({
    'fruit':['apple','orange','banana'],
    'ID_x' : ['1 2 3','4','5'],  
    'ID_y' : ['A B', 'C D','E']
    }, index=['0','1','2'])

I want to split each value in the columns (ID_x and ID_y) and create new rows so that each row holds a single ID_x value and a single ID_y value.

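The result should pair every ID_x value with every ID_y value from the same row. For apple, that means the rows 1 A, 1 B, 2 A, 2 B, 3 A and 3 B.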

Any idea how to tackle this problem?

What I have tried so far is splitting the values in the columns:

col_x = 'ID_x'
col_y = 'ID_y'

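# df_unflat is assumed to be the original dataframe shown above
df = df_unflat.assign(**{col_x: df_unflat[col_x].str.split(' ')})
# Build on df (not df_unflat) so the ID_x split from the previous line is kept
df = df.assign(**{col_y: df[col_y].str.split(' ')})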

Try it this way:

import pandas as pd
df = pd.DataFrame({
    'fruit':['apple','orange','banana'],
    'ID_x' : ['1 2 3','4','5'],  
    'ID_y' : ['A B', 'C D','E']
    }, index=['0','1','2'])
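# Split on spaces, expand each list into columns, then stack into one long Series
id_x = df['ID_x'].str.split(' ').apply(pd.Series).stack()
id_y = df['ID_y'].str.split(' ').apply(pd.Series).stack()
# Drop the inner index level added by stack() so the index lines up with df again
id_x.index = id_x.index.droplevel(-1)
id_y.index = id_y.index.droplevel(-1)
id_x.name = 'ID_x'
id_y.name = 'ID_y'
# Replace the original columns; joining on the repeated index pairs every ID_x with every ID_y
del df['ID_x']
del df['ID_y']
df = df.join(id_x)
df = df.join(id_y)
df.reset_index(drop=True)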

Output:

    fruit   ID_x    ID_y
0   apple   1       A
1   apple   1       B
2   apple   2       A
3   apple   2       B
4   apple   3       A
5   apple   3       B
6   orange  4       C
7   orange  4       D
8   banana  5       E
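
On a reasonably recent pandas (DataFrame.explode was added in 0.25), the same result can probably be reached more directly by splitting both columns into lists and exploding them one after the other. This is a minimal sketch, not the answer's method: it assumes the question's original df, and the name `out` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['apple', 'orange', 'banana'],
    'ID_x': ['1 2 3', '4', '5'],
    'ID_y': ['A B', 'C D', 'E'],
}, index=['0', '1', '2'])

# Turn each space-separated string into a list of values
out = df.assign(ID_x=df['ID_x'].str.split(' '),
                ID_y=df['ID_y'].str.split(' '))

# Explode one column at a time; each explode repeats the other columns,
# so doing both yields every ID_x/ID_y pair within an original row.
# The index is reset in between to avoid exploding on a duplicated index.
out = out.explode('ID_x').reset_index(drop=True)
out = out.explode('ID_y').reset_index(drop=True)
print(out)
```

Exploding the columns sequentially, rather than together, is what produces all combinations; the values stay strings, as in the output above.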
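import itertools
# Take the rows as a NumPy array in a fixed column order, build every
# (ID_x, ID_y, fruit) combination per row, then rebuild the DataFrame.
# Selecting cols explicitly matters: newer pandas keeps the dict's insertion order (fruit first).
cols = ['ID_x', 'ID_y', 'fruit']
pd.DataFrame(sum([list(itertools.product(e[0].split(), e[1].split(), [e[2]])) for e in df[cols].values], []), columns=cols)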
Out[483]: 
  ID_x ID_y   fruit
0    1    A   apple
1    1    B   apple
2    2    A   apple
3    2    B   apple
4    3    A   apple
5    3    B   apple
6    4    C  orange
7    4    D  orange
8    5    E  banana
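
The one-liner above can be unpacked into named steps. The sketch below is a standard-library-only take on the same itertools.product idea. `expand_rows` and the `records` list are hypothetical names introduced for illustration, and a generator stands in for `sum(..., [])` when flattening the per-row combinations.

```python
import itertools

def expand_rows(records, split_cols):
    """Yield one dict per combination of the space-separated values in split_cols."""
    for rec in records:
        # Split each multi-valued field; the other fields are left untouched
        parts = [rec[col].split() for col in split_cols]
        # product() enumerates every pairing of the split values for this row
        for combo in itertools.product(*parts):
            yield {**rec, **dict(zip(split_cols, combo))}

# Same data as the question's dataframe, written as plain dicts
records = [
    {'fruit': 'apple', 'ID_x': '1 2 3', 'ID_y': 'A B'},
    {'fruit': 'orange', 'ID_x': '4', 'ID_y': 'C D'},
    {'fruit': 'banana', 'ID_x': '5', 'ID_y': 'E'},
]

rows = list(expand_rows(records, ['ID_x', 'ID_y']))
for row in rows:
    print(row)
```

With pandas available, `pd.DataFrame(rows)` should rebuild the nine-row frame, and `df.to_dict('records')` is one way to obtain `records` from the question's dataframe.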
