
R: Fill NaN with a value, or update the value of one column, based on conditions

I have a data frame like the one below (this is a sample; the real data has 8K rows and 1.6K sellers).

    # create the data frame
    df <- data.frame(name = c('Tom', 'Tom', 'Tom', 'Tom', 'Tom', 'jack', 'jack', 'jack', 'jack', 'jack', 'Malik'),
                     week = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1),
                     sell = c(20, 21, 19, 18, 23, 24, 36, 35, 46, 50, 44),
                     demand = c(28, 16, 43, NaN, NaN, 30, 35, 35, 72, NaN, 60)
                     )

    # remaining demand per row
    df$`demand-sell` <- df$demand - df$sell

    df

Expected output logic:

**For week = 4:**
I would like to fill the NaN values of demand for week = 4 with the sum of the remaining demand (demand - sell) of weeks 1, 2 and 3 of the same seller (name).

**Note:**
If the week = 4 demand is not NaN, then add the week = 4 demand to the sum of (demand - sell) of weeks 1, 2 and 3 (for example, the case of name = jack).

**For week = 5:**
I would like to fill the NaN values of demand for week = 5 with the sum of the remaining demand (demand - sell) of weeks 1, 2, 3 and 4 of the same seller (name).

**Note:**
If the week = 5 demand is not NaN, then add the week = 5 demand to the sum of (demand - sell) of weeks 1, 2, 3 and 4.
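As a small worked check of the week = 4 rule, the arithmetic for Tom can be done by hand in base R. This is only a sketch using Tom's first three rows from the sample data; the variable names `remaining` and `demand_week4` are chosen here for illustration.

```r
# Tom's weeks 1 to 3 from the sample data
sell   <- c(20, 21, 19)
demand <- c(28, 16, 43)

# Remaining demand per week (demand - sell): 8, -5, 24
remaining <- demand - sell

# Tom's week 4 demand is NaN, so it is filled with the sum of the remaining demand
demand_week4 <- sum(remaining)
demand_week4  # 27
```
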

Expected output (sample data):

[Image: expected output sample]

Update with the correct answer:

The problem was a missing `is.nan(demand)` check.

Here is the correct answer:

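    library(dplyr)

    # Note: `[1:3]` and `[4]` index by position, so rows must be sorted by week within each name
    df %>% 
        mutate(`demand-sell` = demand - sell) %>%
        group_by(name) %>% 
        # week 4: fill NaN with the carry-over of weeks 1-3, otherwise add the carry-over
        mutate(demand=case_when(week == 4 & is.nan(demand) ~ sum(`demand-sell`[1:3]),
                                week == 4 & !is.nan(demand) ~ demand + sum(`demand-sell`[1:3]),
                                TRUE ~ demand)) %>% 
        # recompute the remaining demand for week 4
        mutate(`demand-sell`= case_when(week == 4 ~ demand-sell,
                                        TRUE ~ `demand-sell`)) %>% 
        # week 5: demand becomes the remaining demand of the updated week 4
        mutate(demand = case_when(week == 5 ~ `demand-sell`[4],
                                  TRUE ~ demand)) %>% 
        mutate(`demand-sell`= case_when(week == 5 ~ demand-sell,
                                        TRUE ~ `demand-sell`))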

Correct output:

   name   week  sell demand `demand-sell`
   <chr> <dbl> <dbl>  <dbl>         <dbl>
 1 Tom       1    20     28             8
 2 Tom       2    21     16            -5
 3 Tom       3    19     43            24
 4 Tom       4    18     27             9
 5 Tom       5    23      9           -14
 6 jack      1    24     30             6
 7 jack      2    36     35            -1
 8 jack      3    35     35             0
 9 jack      4    46     77            31
10 jack      5    50     31           -19
11 Malik     1    44     60            16
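The corrected answer picks rows by position within each group (`[1:3]`, `[4]`), which assumes every seller's rows are sorted by week with none missing. The following base R sketch selects rows by week value instead. It is an illustration under assumptions, not the answerer's method: `fill_one_seller` is a hypothetical helper name, and, like the answer above, it sets week 5 from the updated week 4 remainder without a separate branch for a non-NaN week 5 demand.

```r
# Sample data from the question
df <- data.frame(
  name   = c('Tom', 'Tom', 'Tom', 'Tom', 'Tom', 'jack', 'jack', 'jack', 'jack', 'jack', 'Malik'),
  week   = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1),
  sell   = c(20, 21, 19, 18, 23, 24, 36, 35, 46, 50, 44),
  demand = c(28, 16, 43, NaN, NaN, 30, 35, 35, 72, NaN, 60)
)

# Hypothetical helper: applies the week 4 and week 5 rules to one seller's rows,
# selecting rows by week value rather than by position
fill_one_seller <- function(d) {
  d <- d[order(d$week), ]

  # Carry-over: sum of (demand - sell) over weeks 1 to 3
  carry <- sum((d$demand - d$sell)[d$week %in% 1:3])

  # Week 4: NaN becomes the carry-over, an existing value is increased by it
  w4 <- d$week == 4
  d$demand[w4] <- ifelse(is.nan(d$demand[w4]), carry, d$demand[w4] + carry)

  # Week 5: takes the remaining demand of the updated week 4
  w5 <- d$week == 5
  if (any(w4) && any(w5)) {
    d$demand[w5] <- (d$demand - d$sell)[w4]
  }

  d$`demand-sell` <- d$demand - d$sell
  d
}

# Split by seller (keeping the original seller order), apply, and recombine
groups <- split(df, factor(df$name, levels = unique(df$name)))
result <- do.call(rbind, lapply(groups, fill_one_seller))
rownames(result) <- NULL
result
```

On the sample data this should give the same demand column as the corrected answer. Sellers without a week 4 or week 5 row, such as Malik, pass through unchanged apart from the added `demand-sell` column.
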

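First answer: here is a solution. It is correct for Tom at least. I am not sure whether your expected output for jack is correct. If the logic is the same for every name, it should look like this: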

    df %>% 
        mutate(`demand-sell` = demand - sell) %>%
        group_by(name) %>% 
        mutate(demand=case_when(week == 4 ~ sum(`demand-sell`[1:3]),
                                TRUE ~ demand)) %>% 
        mutate(`demand-sell`= case_when(week == 4 ~ demand-sell,
                                        TRUE ~ `demand-sell`)) %>% 
        mutate(demand = case_when(week == 5 ~ `demand-sell`[4],
                                  TRUE ~ demand)) %>% 
        mutate(`demand-sell`= case_when(week == 5 ~ demand-sell,
                                        TRUE ~ `demand-sell`))

Output:

   name   week  sell demand `demand-sell`
   <chr> <dbl> <dbl>  <dbl>         <dbl>
 1 Tom       1    20     28             8
 2 Tom       2    21     16            -5
 3 Tom       3    19     43            24
 4 Tom       4    18     27             9
 5 Tom       5    23      9           -14
 6 jack      1    24     30             6
 7 jack      2    36     35            -1
 8 jack      3    35     35             0
 9 jack      4    46      5           -41
10 jack      5    50    -41           -91
11 Malik     1    44     60            16
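The jack rows differ between the two outputs because the first answer overwrites the week 4 demand unconditionally, so jack's existing value of 72 is lost. A short base R trace of jack's numbers makes the difference visible. The variable names are illustrative only, and the figures come from the sample data.

```r
# jack's carry-over from weeks 1 to 3: 6 + (-1) + 0 = 5
carry_over <- sum(c(30, 35, 35) - c(24, 36, 35))

demand_week4 <- 72  # jack's week 4 demand is not NaN
sell_week4   <- 46

# First answer: week 4 demand is replaced by the carry-over regardless of its value
first_answer_week4 <- carry_over
first_answer_week4 - sell_week4   # -41, which then becomes the week 5 demand

# With the is.nan() check: the existing demand is kept and the carry-over is added
corrected_week4 <- if (is.nan(demand_week4)) carry_over else demand_week4 + carry_over
corrected_week4 - sell_week4      # 31, which then becomes the week 5 demand
```
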
