Create a new column of lists in Pandas dataframe with unique values from another column

Question

I have a dataframe with a column of lists:

  full_list_to_check
0                NaN
1                NaN
2    [1, 2, 3, 4, 5]
3             [6, 6]
4           [11, 11]

I need to create a new column that holds, for each row, the same list with any duplicates removed. Lists without duplicates should stay as they are.

  full_list_to_check          new_col
0                NaN              NaN
1                NaN              NaN
2    [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
3             [6, 6]              [6]
4           [11, 11]             [11]

I have tried this:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)))

But I get this error:

TypeError: 'float' object is not iterable

Answer 1

You must check for NaN:

df['full_list_to_check'].apply(lambda x: list(set(x)) if not np.any(pd.isna(x)) else np.nan)

Update:

df['full_list_to_check'].apply(lambda x: list(set(x)) if x is not np.nan else np.nan)

0                NaN
1                NaN
2    [1, 2, 3, 4, 5]
3                [6]
4               [11]
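
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question, reproduces the error, and then applies the NaN check from the first snippet. The helper name unique_or_nan is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    {"full_list_to_check": [np.nan, np.nan, [1, 2, 3, 4, 5], [6, 6], [11, 11]]}
)

# NaN is a float, so set(NaN) raises the TypeError from the question.
try:
    df["full_list_to_check"].apply(lambda x: list(set(x)))
    raised = False
except TypeError:
    raised = True


def unique_or_nan(value):
    """Hypothetical helper: deduplicate a list, pass missing values through."""
    # pd.isna gives one bool for a scalar and an array of bools for a list;
    # np.any collapses either result to a single truth value.
    if np.any(pd.isna(value)):
        return np.nan
    return list(set(value))


df["new_col"] = df["full_list_to_check"].apply(unique_or_nan)
print(df)
```

Two caveats, as far as I can tell: with the np.any(pd.isna(x)) check, a list that itself contains a NaN element is also turned into NaN, and the x is not np.nan variant is an identity check, so it only recognises the np.nan object itself and not, for example, float('nan').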

Answer 2

You can try this:

df['new_col'] = df.loc[~df['full_list_to_check'].isna(), 'full_list_to_check'].apply(lambda x: list(set(x)))

  full_list_to_check          new_col
0                NaN              NaN
1                NaN              NaN
2    [1, 2, 3, 4, 5]  [1, 2, 3, 4, 5]
3             [6, 6]              [6]
4           [11, 11]             [11]
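
A short sketch of why the filtered rows come back as NaN, assuming the same sample frame as in the question. The variable name has_value is only illustrative, and notna() is used here as the equivalent of ~isna().

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    {"full_list_to_check": [np.nan, np.nan, [1, 2, 3, 4, 5], [6, 6], [11, 11]]}
)

# Boolean mask of the rows that actually hold a value.
has_value = df["full_list_to_check"].notna()

# apply() only sees the selected rows, so set() is never called on NaN.
deduped = df.loc[has_value, "full_list_to_check"].apply(lambda x: list(set(x)))

# Assignment aligns on the index: rows 0 and 1 are absent from `deduped`,
# so they are filled with NaN in the new column.
df["new_col"] = deduped
print(df)
```

The intermediate Series only has the index labels 2, 3 and 4, which is what lets pandas put the results back on the right rows.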

Answer 3

You could use:

df['new_col'] = df['full_list_to_check'].apply(lambda x: list(set(x)) if isinstance(x, list) else x)

The other answers only work if the column contains nothing other than lists or NaN.
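
To illustrate the point about other value types, here is a small sketch with made-up data: a string and an integer are mixed in with lists and NaN. The Series name mixed and its contents are assumptions for the example, not data from the question.

```python
import numpy as np
import pandas as pd

# Assumed sample data: lists and NaN plus a string and an integer.
mixed = pd.Series([np.nan, [6, 6], "abc", 7, [3, 1, 3, 2]])

# Only real lists are deduplicated; everything else is returned unchanged.
deduped = mixed.apply(lambda x: list(set(x)) if isinstance(x, list) else x)
print(deduped)

# Optional variation, not from the answer: dict.fromkeys drops duplicates
# while keeping first-seen order, which set() does not promise.
ordered = mixed.apply(
    lambda x: list(dict.fromkeys(x)) if isinstance(x, list) else x
)
print(ordered)
```

Without the isinstance check, set("abc") would split the string into characters and set(7) would raise a TypeError, so the type test is what keeps those rows intact.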
