
Pandas DataFrame: prepend a column's value to another column whose elements are lists

The input DataFrame is as follows:

import pandas as pd

data = {
    's_id': [5, 7, 26, 70.0, 55, 71.0, 8.0, 'nan', 'nan', 4],
    'r_id': [[34, 44, 23, 11, 71], [53, 33, 73, 41], [17], [10, 31], [17], [75, 8], [7], [68], [50], []]
}

df = pd.DataFrame.from_dict(data)
df
Out[240]: 
  s_id                  r_id
0    5  [34, 44, 23, 11, 71]
1    7      [53, 33, 73, 41]
2   26                  [17]
3   70              [10, 31]
4   55                  [17]
5   71               [75, 8]
6    8                   [7]
7  nan                  [68]
8  nan                  [50]
9    4                    []

Expected dataframe

data = {
    's_id': [5, 7, 26, 70.0, 55, 71.0, 8.0, 'nan', 'nan', 4],
    'r_id': [[5, 34, 44, 23, 11, 71], [7, 53, 33, 73, 41], [26, 17], [70, 10, 31], [55, 17], [71, 75, 8], [8, 7], [68], [50], [4]]
}

df = pd.DataFrame.from_dict(data)
df
Out[241]: 
  s_id                     r_id
0    5  [5, 34, 44, 23, 11, 71]
1    7      [7, 53, 33, 73, 41]
2   26                 [26, 17]
3   70             [70, 10, 31]
4   55                 [55, 17]
5   71              [71, 75, 8]
6    8                   [8, 7]
7  nan                     [68]
8  nan                     [50]
9    4                      [4]

I need to insert each s_id value as the first element of the list in r_id for the same row. The s_id column also contains nan values, and some of its values appear as floats. Thank you.

I tried the following:

df['r_id'] = df["s_id"].apply(lambda x : x.append(df['r_id']) )

df['r_id'] = df["s_id"].apply(lambda x : [x].append(df['r_id'].values.tolist()))
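A minimal sketch of why attempts like these do not produce the expected lists, using a small two-row frame made up for illustration (not the data above). `list.append` changes the list in place and returns `None`, so a lambda that returns the result of `.append(...)` returns `None` for every row. In the first attempt, `x` is also a scalar from s_id, which has no `append` method at all, so that one raises an `AttributeError`.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the question's)
df = pd.DataFrame({"s_id": [5, 7], "r_id": [[34, 44], [53]]})

# list.append mutates the list in place and returns None
result = [5].append(34)
print(result)

# The lambda therefore returns None for every row; it also appends the
# whole r_id column as a single element instead of the row's own list
out = df["s_id"].apply(lambda x: [x].append(df["r_id"].values.tolist()))
print(out)
```

Building a new list with `+` (for example `[x] + row_list`) avoids both problems, because it returns the combined list instead of mutating one in place.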

If the nan values are real missing values (np.nan), use DataFrame.apply with axis=1: convert each s_id to an integer, wrap it in a one-element list, and prepend it to r_id. Rows where s_id is missing keep their r_id unchanged:

import numpy as np
import pandas as pd

data = {
    's_id': [5, 7, 26, 70.0, 55, 71.0, 8.0, np.nan, np.nan, 4],
    'r_id': [[34, 44, 23, 11, 71], [53, 33, 73, 41],
             [17], [10, 31], [17], [75, 8], [7], [68], [50], []]
}

df = pd.DataFrame.from_dict(data)
print(df)

# Prepend int(s_id) to the row's list; leave the list as is when s_id is missing
f = lambda x: [int(x["s_id"])] + x['r_id'] if pd.notna(x["s_id"]) else x['r_id']
df['r_id'] = df.apply(f, axis=1)
print(df)
   s_id                     r_id
0   5.0  [5, 34, 44, 23, 11, 71]
1   7.0      [7, 53, 33, 73, 41]
2  26.0                 [26, 17]
3  70.0             [70, 10, 31]
4  55.0                 [55, 17]
5  71.0              [71, 75, 8]
6   8.0                   [8, 7]
7   NaN                     [68]
8   NaN                     [50]
9   4.0                      [4]
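If s_id instead holds the literal string 'nan', as in the question's input, one possible preprocessing step is to convert the column to numeric first so that those entries become real missing values. This is a sketch under that assumption, not part of the original answer; the `raw` series below is made up for illustration.

```python
import pandas as pd

# Hypothetical object column mixing ints, floats and the string 'nan'
raw = pd.Series([5, 7, 70.0, 'nan', 4], dtype=object)

# errors='coerce' turns anything that cannot be parsed into NaN
s_id = pd.to_numeric(raw, errors='coerce')
print(s_id)
```

After this conversion, `pd.notna` distinguishes the real ids from the missing ones, so the same row-wise function can be applied.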

Another idea is to build a boolean mask on s_id and apply the function only to the rows where it is not NaN:

m = df["s_id"].notna()
f = lambda x : [int(x["s_id"])] + x['r_id']
df.loc[m, 'r_id'] = df[m].apply(f, axis=1)
print (df)
   s_id                     r_id
0   5.0  [5, 34, 44, 23, 11, 71]
1   7.0      [7, 53, 33, 73, 41]
2  26.0                 [26, 17]
3  70.0             [70, 10, 31]
4  55.0                 [55, 17]
5  71.0              [71, 75, 8]
6   8.0                   [8, 7]
7   NaN                     [68]
8   NaN                     [50]
9   4.0                      [4]
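As a further alternative that is not part of the original answer, the same result can be sketched with a plain list comprehension over the two columns, which avoids the row-wise `apply`. The shortened frame below is illustrative only.

```python
import numpy as np
import pandas as pd

# Shortened illustrative data in the same shape as the example above
df = pd.DataFrame({
    "s_id": [5, 70.0, np.nan, 4],
    "r_id": [[34, 44], [10, 31], [68], []],
})

# Pair each s_id with its list; prepend int(s_id) only when it is not missing
new_lists = [
    [int(s)] + r if pd.notna(s) else r
    for s, r in zip(df["s_id"], df["r_id"])
]

# Wrap in a Series aligned on the index so each list lands in one cell
df["r_id"] = pd.Series(new_lists, index=df.index)
print(df)
```

On large frames this tends to be faster than `apply(axis=1)`, since it does not build a Series object for every row, though the difference depends on the data.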
