
Pandas multiple conditions within a single column

In the df below, I need to replace COST A and COST B for TYPE E with 0 and replace COMMENT with 'Un reported cost' when the following conditions are met:

  1. E and F have the same cost for 'COST A'
  2. E and F have the same cost for 'COST B'

As you can see in the expected result, the 0.05 and 20 for E are replaced with 0, because E and F have the same costs.

import pandas as pd

df = pd.DataFrame([['1/1/2021','SKU_1','A','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','B','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','C','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','D','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','F','0.05','20','Actual']],
                   columns = ['MTH-YR','SKU','TYPE','COST A','COST B','COMMENT'])

Expected result:

     MTH-YR    SKU  TYPE COST A COST B           COMMENT
0  1/1/2021  SKU_1     A      0      0  Un reported cost
1  1/1/2021  SKU_1     B      0      0  Un reported cost
2  1/1/2021  SKU_1     C      0      0  Un reported cost
3  1/1/2021  SKU_1     D      0      0  Un reported cost
4  1/1/2021  SKU_1     E      0      0  Un reported cost
5  1/1/2021  SKU_1     F   0.05     20            Actual
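
for i in range(len(df) - 1):  # stop one row early so i + 1 stays in bounds
    # First check whether row i is type E
    if df.loc[i, "TYPE"] == "E":
        # Both conditions hold when E and the next row (F) have the same costs
        if (df.loc[i, "COST A"] == df.loc[i + 1, "COST A"]
                and df.loc[i, "COST B"] == df.loc[i + 1, "COST B"]):
            # Replace the old values; .loc avoids chained assignment.
            # The costs are stored as strings in this df, so write "0".
            df.loc[i, "COST A"] = "0"
            df.loc[i, "COST B"] = "0"
            df.loc[i, "COMMENT"] = "Un reported cost"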
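
The row-by-row comparison in the loop can also be written without an explicit loop. The following is only a sketch under the same assumption the loop makes, namely that the F row sits directly below its E row; it additionally checks that the next row really is type F, which the loop does not do. The names `nxt` and `mask` are illustrative.

```python
import pandas as pd

# Same data as in the question (costs are stored as strings there).
df = pd.DataFrame(
    [["1/1/2021", "SKU_1", "A", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "B", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "C", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "D", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "E", "0.05", "20", "Calculated"],
     ["1/1/2021", "SKU_1", "F", "0.05", "20", "Actual"]],
    columns=["MTH-YR", "SKU", "TYPE", "COST A", "COST B", "COMMENT"],
)

# Assumption: the F row is the row immediately after its E row.
nxt = df.shift(-1)  # row i of nxt holds the values of row i + 1 of df

# True only for E rows whose following F row has identical costs.
mask = (
    (df["TYPE"] == "E")
    & (nxt["TYPE"] == "F")
    & (df["COST A"] == nxt["COST A"])
    & (df["COST B"] == nxt["COST B"])
)

# Overwrite the matching rows in one step.
df.loc[mask, ["COST A", "COST B"]] = "0"
df.loc[mask, "COMMENT"] = "Un reported cost"
```

If the E and F rows are not adjacent, this sketch will not match them, and a lookup keyed on MTH-YR and SKU would be needed instead.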

If this is just for this small dataset, use the .iloc method.

## Change COST B for E to 0
df.iloc[4,4] = 0

## Change COMMENT for E to 'Un reported cost'
df.iloc[4,5] = 'Un reported cost'
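
Hard-coded positions such as 4 and 5 break as soon as rows or columns move. As a possible variation, the positions can be looked up from the labels first. This is a sketch only: `e_rows`, `cost_cols` and `comment_col` are illustrative names, and like the .iloc lines above it does not check whether E and F actually share the same costs.

```python
import pandas as pd

# A shortened version of the question's data (costs stored as strings).
df = pd.DataFrame(
    [["1/1/2021", "SKU_1", "D", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "E", "0.05", "20", "Calculated"],
     ["1/1/2021", "SKU_1", "F", "0.05", "20", "Actual"]],
    columns=["MTH-YR", "SKU", "TYPE", "COST A", "COST B", "COMMENT"],
)

# Integer positions of the E rows and of the columns to overwrite.
e_rows = [i for i, t in enumerate(df["TYPE"]) if t == "E"]
cost_cols = [df.columns.get_loc(c) for c in ("COST A", "COST B")]
comment_col = df.columns.get_loc("COMMENT")

# Same idea as df.iloc[4, 4] = 0, but without hard-coded positions.
df.iloc[e_rows, cost_cols] = "0"
df.iloc[e_rows, comment_col] = "Un reported cost"
```

This also covers COST A, which the two .iloc lines above leave unchanged.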

UPDATE: I was playing around with this before going to sleep and came across this solution for larger data.

Let's define your data. I have added duplicate E's here.

## Create dataframe
df = pd.DataFrame([['1/1/2021','SKU_1','A','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','B','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','C','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','D','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','F','0.05','20','Actual']],
                   columns = ['MTH-YR','SKU','TYPE','COST A','COST B','COMMENT'])

Using a list comprehension, we can assign values to COST B based on the conditions that you need.

df['COST B'] = ['0' if df['TYPE'][x] == 'E' else df['COST B'][x] for x in range(len(df['TYPE']))]

In essence, we're just looping through the values in TYPE to see which ones are type 'E'. When one is found, we change that value to '0'; otherwise, we keep the value stored in COST B. The same logic is applied to COMMENT.

df['COMMENT'] = ['Un reported cost' if df['TYPE'][x] == 'E' else df['COMMENT'][x] for x in range(len(df['TYPE']))]
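
Note that the two list comprehensions overwrite every E row, whether or not its costs match F, and they leave COST A untouched. One way to add the two conditions from the question is to look up the F costs per MTH-YR and SKU and compare them against each E row. The following is a sketch, not a definitive implementation: it assumes at most one F row per (MTH-YR, SKU) pair, the second E row with cost 0.07 is a hypothetical row added to show a non-matching case, and the names `keys`, `f_costs` and `merged` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["1/1/2021", "SKU_1", "D", "0", "0", "Un reported cost"],
     ["1/1/2021", "SKU_1", "E", "0.05", "20", "Calculated"],
     ["1/1/2021", "SKU_1", "E", "0.07", "20", "Calculated"],  # hypothetical: differs from F
     ["1/1/2021", "SKU_1", "F", "0.05", "20", "Actual"]],
    columns=["MTH-YR", "SKU", "TYPE", "COST A", "COST B", "COMMENT"],
)

keys = ["MTH-YR", "SKU"]
cost_cols = ["COST A", "COST B"]

# Assumption: at most one F row per (MTH-YR, SKU) pair, so the left merge
# keeps one output row per input row, in the original order.
f_costs = df.loc[df["TYPE"] == "F", keys + cost_cols]
merged = df.merge(f_costs, on=keys, how="left", suffixes=("", " (F)"))

# E rows whose COST A and COST B both equal the F costs of the same group.
mask = (
    (merged["TYPE"] == "E")
    & (merged["COST A"] == merged["COST A (F)"])
    & (merged["COST B"] == merged["COST B (F)"])
).to_numpy()

df.loc[mask, cost_cols] = "0"
df.loc[mask, "COMMENT"] = "Un reported cost"
```

Only the first E row is changed here; the E row with cost 0.07 keeps its values because it does not match F on COST A.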
