How to remove values from lists inside a column of a Pandas DataFrame

Question

Although it is not good coding practice, I've run into a special kind of problem in which I need to go through a column of lists and erase particular values. I suppose one solution would be to melt the 'neighbors' column, but I believe the code I have is close to the objective. I've prepared a reproducible example for better understanding:

import pandas as pd
import numpy as np


def removing_nan_neighboors(custom_df):
    nan_list = list(custom_df[custom_df['values'].notna()]['customer'])
    print(nan_list)
    custom_df['neighbors'] = [x for x in custom_df['neighbors'] if x not in nan_list]
    return custom_df


customer = [1, 2, 3, 4, 5, 6]
values = [np.nan, np.nan, 10, np.nan, 11, 12]
neighbors = [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]]
df = pd.DataFrame({'customer': customer, 'values': values, 'neighbors': neighbors})
df = removing_nan_neighboors(df)

print(df)

   customer values neighbors
0        1     NaN    [6, 2]
1        2     NaN    [1, 3]
2        3    10.0    [2, 4]
3        4     NaN    [3, 5]
4        5    11.0    [4, 6]
5        6    12.0    [5, 1]

The objective is to erase customer numbers from the neighbors lists if those customers have NaN values:

   customer values neighbors
0        1     NaN    [6]
1        2     NaN    [3]
2        3    10.0    []
3        4     NaN    [3, 5]
4        5    11.0    [6]
5        6    12.0    [5]

But I have not gotten that far, because my function doesn't work as intended yet. Any help is appreciated.

Answer 1

Try:

df["cust_1"] = np.where(
    np.isnan(np.roll(df["values"], 1)),
    np.nan,
    np.roll(df["customer"], 1),
)

df["cust_2"] = np.where(
    np.isnan(np.roll(df["values"], -1)),
    np.nan,
    np.roll(df["customer"], -1),
)

df["neighbors"] = df[["cust_1", "cust_2"]].agg(
    lambda x: list(x[x.notna()].astype(int)), axis=1
)
df = df.drop(columns=["cust_1", "cust_2"])

print(df)

Prints:

   customer  values neighbors
0         1     NaN       [6]
1         2     NaN       [3]
2         3    10.0        []
3         4     NaN    [3, 5]
4         5    11.0       [6]
5         6    12.0       [5]
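Note that the np.roll approach above rebuilds each neighbors list from row position (previous and next customer, wrapping around) rather than reading the existing neighbors column. A minimal sketch, assuming the example data from the question, that checks whether this ring layout actually holds before relying on it (the names rebuilt and ring_layout are illustrative only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

# Previous and next customer for every row, wrapping around at the ends.
prev_cust = np.roll(df["customer"].to_numpy(), 1)
next_cust = np.roll(df["customer"].to_numpy(), -1)
rebuilt = [[int(p), int(n)] for p, n in zip(prev_cust, next_cust)]

# True only if every stored neighbors list equals [previous, next].
ring_layout = rebuilt == df["neighbors"].tolist()
print(ring_layout)
```

If the check is False for your data, an approach that filters the stored lists directly is probably the safer choice.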

Answer 2

If I understood your objective correctly, you want to remove from every neighbors list the numbers of those customer rows where values is NaN. So basically you want the result shown in your last table.

I attempted this with a list comprehension:

df['neighbors_new'] = [[n for n in neighbor 
                        if n not in df[df['values'].isna() == True]['customer'].to_list()] 
                       for neighbor in df.neighbors]

And got this:

   customer  values neighbors neighbors_new
0         1     NaN    [6, 2]           [6]
1         2     NaN    [1, 3]           [3]
2         3    10.0    [2, 4]            []
3         4     NaN    [3, 5]        [3, 5]
4         5    11.0    [4, 6]           [6]
5         6    12.0    [5, 1]           [5]
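The comprehension above recomputes the list of NaN customers for every single neighbor it tests. A possible variant, sketched here on the question's example data, builds that collection once as a set so each membership test is cheap (nan_customers is a name chosen for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

# Customers whose value is missing, computed a single time.
nan_customers = set(df.loc[df["values"].isna(), "customer"])

# Keep only the neighbors that are not in that set.
df["neighbors_new"] = [
    [n for n in nbrs if n not in nan_customers]
    for nbrs in df["neighbors"]
]
print(df)
```

This should give the same neighbors_new column as above; the difference only matters for speed on larger frames.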

Answer 3

In your case, explode the lists, then use isin to keep only the customers whose values are notna:

s = df['neighbors'].explode()
df['new'] = s[s.isin(df.loc[df['values'].notna(),'customer'])].groupby(level=0).agg(list)
df
Out[36]: 
   customer  values neighbors     new
0         1     NaN    [6, 2]     [6]
1         2     NaN    [1, 3]     [3]
2         3    10.0    [2, 4]     NaN
3         4     NaN    [3, 5]  [3, 5]
4         5    11.0    [4, 6]     [6]
5         6    12.0    [5, 1]     [5]
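In the output above, row 2 ends up as NaN rather than the empty list asked for in the question, because a row with no surviving neighbors has no group left after the filter. One way this could be patched, sketched on the question's example data (valid and kept are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

s = df["neighbors"].explode()
valid = df.loc[df["values"].notna(), "customer"]

# Rows whose neighbors were all dropped are missing from this result.
kept = s[s.isin(valid)].groupby(level=0).agg(list)

# Restore the missing rows, then replace their NaN with an empty list.
df["new"] = kept.reindex(df.index).apply(
    lambda v: v if isinstance(v, list) else []
)
print(df)
```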
