How to delete DF rows based on multiple column conditions?
Here's an example of the DataFrame:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
2 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [-1, 0]
3 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, -1]
4 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
5 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
6 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
How can I delete the rows where df['VNF'] is [-1, 0] or [0, -1] and df['EC1'], df['EC2'] and df['CDC'] all have a 0 at the same index position as the -1 in df['VNF']?
The expected result would be:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
2 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
3 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
4 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
Here's the constructor for the DataFrame:
import pandas as pd

data = {'EC1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [1, 0]],
        'EC2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'CDC': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'L1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L3': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L4': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L5': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'L6': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'VNF': [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0], [0, 1], [-1, 0]]}
df = pd.DataFrame(data)
A list comprehension that finds which row positions to drop might help see the conditions more directly:
columns = df.EC1, df.EC2, df.CDC, df.VNF
inds_to_drop = [iloc
                for iloc, (ec1, ec2, cdc, vnf) in enumerate(zip(*columns))
                if vnf == [-1, 0] or vnf == [0, -1]
                if all(val[idx] == 0
                       for idx in (vnf.index(-1),) for val in (ec1, ec2, cdc))]
new_df = df.drop(df.index[inds_to_drop])
to get
>>> new_df
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
4 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
5 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
6 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
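The same condition can also be written as a row-wise boolean mask, which some may find easier to read than the nested comprehension. This is only a sketch: the helper name should_drop and the small four-column frame are made up for illustration, and it assumes every cell is a two-element list as in the question. It also resets the index, since the expected result in the question is numbered 0, 1, 2, ... rather than keeping the original labels.

```python
import pandas as pd

# Hypothetical cut-down frame: only the columns the condition looks at.
df = pd.DataFrame({
    "EC1": [[0, 0], [0, 0], [1, 0]],
    "EC2": [[0, 0], [0, 0], [0, 0]],
    "CDC": [[0, 0], [0, 0], [0, 1]],
    "VNF": [[1, 0], [-1, 0], [-1, 0]],
})

def should_drop(row):
    """True when VNF is [-1, 0] or [0, -1] and EC1, EC2, CDC are 0 where VNF is -1."""
    vnf = row["VNF"]
    if vnf not in ([-1, 0], [0, -1]):
        return False
    pos = vnf.index(-1)  # position of the -1 inside the VNF list
    return all(row[col][pos] == 0 for col in ("EC1", "EC2", "CDC"))

mask = df.apply(should_drop, axis=1)        # one boolean per row
new_df = df[~mask].reset_index(drop=True)   # drop matches, renumber from 0
```

Here only the middle row is dropped: the last row also has VNF equal to [-1, 0], but EC1 is 1 at that position, so it stays.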
You can explode every column of df, then identify the elements that satisfy the first condition (the "VNF" values of a row must sum to -1) and the second condition (the element is the -1 in "VNF", and "EC1", "EC2" and "CDC" are 0 at that position), and filter out the elements that satisfy both conditions to create temp. Since each cell must have two elements, a row that lost an element is left with only one entry for its index. So count the entries per index with transform('count'), keep only the indices that still have two, then groupby the index and aggregate back to lists:
exploded = df.explode(df.columns.tolist())
first_cond = exploded.groupby(level=0)['VNF'].transform('sum').eq(-1)
second_cond = exploded['VNF'].eq(-1) & exploded['EC1'].eq(0) & exploded['EC2'].eq(0) & exploded['CDC'].eq(0)
temp = exploded[~(first_cond & second_cond)]
out = temp[temp.groupby(level=0)['VNF'].transform('count').gt(1)].groupby(level=0).agg(list).reset_index(drop=True)
Output:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 \
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0]
2 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1]
3 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1]
4 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1]
VNF
0 [1, 0]
1 [0, 1]
2 [1, 0]
3 [0, 1]
4 [-1, 0]
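If the explode / count / groupby round trip is hard to follow, it may help to watch it on a tiny frame. The sketch below is not the answer's code: the two-column frame is hypothetical, and the filter simply removes the -1 element so that one row is left incomplete. It assumes pandas 1.3 or later, where explode accepts a list of columns.

```python
import pandas as pd

# Hypothetical two-column frame, just to show the mechanics.
df = pd.DataFrame({
    "EC1": [[0, 0], [1, 0]],
    "VNF": [[-1, 0], [0, 1]],
})

# Each list element becomes its own row; the original row label is repeated,
# so the index goes from [0, 1] to [0, 0, 1, 1].
exploded = df.explode(["EC1", "VNF"])

# Remove one element (the -1 in row 0); row 0 now has a single entry left.
temp = exploded[exploded["VNF"].ne(-1)]

# Count the surviving entries per original row and keep only complete rows.
sizes = temp.groupby(level=0)["VNF"].transform("count")
complete = temp[sizes.gt(1)]

# Collect the elements back into lists, one row per original label.
out = complete.groupby(level=0).agg(list).reset_index(drop=True)
```

After this, out holds a single row with EC1 equal to [1, 0] and VNF equal to [0, 1]; the row that lost its -1 element is gone entirely, which is the effect the count filter is there to produce.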