
Create a new column of lists in Pandas dataframe with unique values from another column

I have a dataframe with a column of lists:

    full_list_to_check
 0          NaN 
 1          NaN 
 2    [1, 2, 3, 4, 5] 
 3        [6, 6] 
 4        [11, 11] 

I need to create a new column that holds, for each row, the list with duplicates removed. Lists that contain no duplicates should stay the same.

  full_list_to_check          new_col
0                NaN              NaN
1                NaN              NaN
2    [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
3             [6, 6]              [6]
4           [11, 11]             [11]

I have tried this:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)))

But I get this error:

TypeError: 'float' object is not iterable
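
The error comes from the rows that hold NaN: NaN is an ordinary float, and set() cannot iterate over a float. A minimal sketch that reproduces the failure without pandas (the variable names are only illustrative):

```python
import math

# NaN, the marker for a missing value, is an ordinary float
missing = float("nan")

print(type(missing).__name__)  # float
print(math.isnan(missing))     # True

try:
    # This is the same call the lambda makes when it reaches a NaN row
    list(set(missing))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)       # 'float' object is not iterable
```

So any fix has to skip or pass through the non-list rows before calling set().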

You must check for NaN first:

df['full_list_to_check'].apply(lambda x: list(set(x)) if not np.any(pd.isna(x)) else np.nan)

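Update: a simpler check, which assumes the missing values are the np.nan object itself (it is an identity test, so other NaN floats would not match):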

df['full_list_to_check'].apply(lambda x: list(set(x)) if x is not np.nan else np.nan)
0                NaN
1                NaN
2    [1, 2, 3, 4, 5]
3                [6]
4               [11]

You can try this:

df['new_col'] = df.loc[~df['full_list_to_check'].isna(), 'full_list_to_check'].apply(lambda x: list(set(x)))
full_list_to_check new_col
0 NaN              NaN
1 NaN              NaN
2 [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
3 [6, 6]           [6]
4 [11, 11]         [11]
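
The filtered assignment works because pandas aligns on the index: apply only sees the non-missing rows, and the rows that were left out get NaN when the result is assigned back. A small sketch of that behaviour, rebuilding the example frame from the question (mask and deduped are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"full_list_to_check": [np.nan, np.nan, [1, 2, 3, 4, 5], [6, 6], [11, 11]]}
)

# Keep only the rows that actually hold a value
mask = df["full_list_to_check"].notna()
deduped = df.loc[mask, "full_list_to_check"].apply(lambda x: list(set(x)))

# The result keeps the original row labels of the rows it was computed from
print(deduped.index.tolist())  # [2, 3, 4]

# Assignment aligns on those labels; rows 0 and 1 have no match and become NaN
df["new_col"] = deduped
print(df)
```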

You could use:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)) if isinstance(x,list) else x)

The other answers only work if your data contains nothing other than lists or NaN.
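
To illustrate the difference, here is a sketch with a column that also holds a string and an integer. With the NaN-only checks, set() would split the string into characters and raise a TypeError on the integer. The helper dedupe_if_list is a hypothetical name. It also swaps set() for dict.fromkeys(), which is my own substitution rather than part of the answer above, so that the first occurrence of each element keeps its position (set() makes no ordering guarantee):

```python
import numpy as np
import pandas as pd


def dedupe_if_list(value):
    """Drop duplicates from a list, keeping first occurrences; pass other values through."""
    if isinstance(value, list):
        # dict keys are unique and keep insertion order (elements must be hashable)
        return list(dict.fromkeys(value))
    return value


# Mixed column: NaN, lists, a string and an integer
s = pd.Series([np.nan, [3, 1, 3, 2], "abc", 7, [6, 6]])
result = s.apply(dedupe_if_list)

print(result.tolist())  # [nan, [3, 1, 2], 'abc', 7, [6]]
```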
