
Masking specific cell values in rows that meet a condition

I am generating frequencies for some survey questions and then combining some of these questions into a data frame. Each question has a Yes/No response, which is also reported as No % and Yes %.

Now, if in a given row No < 15 OR Yes < 15, then only the Total value should be visible in that row, while the No, Yes, No % and Yes % columns are masked as NA.

I have been experimenting with case_when and other options but having little luck. I'll keep plugging away, but if an obvious solution strikes someone, I'd be grateful. I am not wedded to dplyr for the solution. Thanks in advance!

The example data frame is shown below as mytab:

mytab <- structure(list(No = c(271L, 1395L, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L), Yes = c(1138L, 14L, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
               Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
               `No (%)` = c(19.2334989354152, 99.0063875088715, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
               `Yes (%)` = c(80.7665010645848, 0.99361249112846, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
          row.names = c(NA, -9L),
          class = "data.frame")

mytab

#>     No  Yes Total   No (%)    Yes (%)
#> 1  271 1138  1409 19.23350 80.7665011
#> 2 1395   14  1409 99.00639  0.9936125
#> 3 1393   16  1409 98.86444  1.1355571
#> 4 1338   71  1409 94.96097  5.0390348
#> 5 1254  155  1409 88.99929 11.0007097
#> 6 1355   54  1409 96.16749  3.8325053
#> 7 1332   77  1409 94.53513  5.4648687
#> 8 1380   29  1409 97.94180  2.0581973
#> 9 1360   49  1409 96.52236  3.4776437

The solution should yield mytab2, which can then be piped to knitr.

mytab2 <- structure(list(No = c(271L, NA, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L),
                         Yes = c(1138L, NA, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
                         Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
                         `No (%)` = c(19.2334989354152, NA, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
                         `Yes (%)` = c(80.7665010645848, NA, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
                    row.names = c(NA, -9L),
                    class = "data.frame")

mytab2
#>     No  Yes Total   No (%)   Yes (%)
#> 1  271 1138  1409 19.23350 80.766501
#> 2   NA   NA  1409       NA        NA
#> 3 1393   16  1409 98.86444  1.135557
#> 4 1338   71  1409 94.96097  5.039035
#> 5 1254  155  1409 88.99929 11.000710
#> 6 1355   54  1409 96.16749  3.832505
#> 7 1332   77  1409 94.53513  5.464869
#> 8 1380   29  1409 97.94180  2.058197
#> 9 1360   49  1409 96.52236  3.477644

This is the same as divibisan's answer, but with data.table syntax that reduces repetition of the table name and uses %between% (since it seems to fit):

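library(data.table)
mybadtab = data.table(mytab)  # builds a new data.table; mytab itself is left unchanged

mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")
# No outside [mymin, Total - mymin] means No < mymin or Yes < mymin (given No + Yes == Total)
mybadtab[!( No %between% c(mymin, Total - mymin) ), (badcols) := NA]
mybadtab  # := updates by reference and returns invisibly, so print explicitly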

     No  Yes Total   No (%)   Yes (%)
1:  271 1138  1409 19.23350 80.766501
2:   NA   NA  1409       NA        NA
3: 1393   16  1409 98.86444  1.135557
4: 1338   71  1409 94.96097  5.039035
5: 1254  155  1409 88.99929 11.000710
6: 1355   54  1409 96.16749  3.832505
7: 1332   77  1409 94.53513  5.464869
8: 1380   29  1409 97.94180  2.058197
9: 1360   49  1409 96.52236  3.477644
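The %between% filter only matches the original "No < 15 OR Yes < 15" rule when No + Yes equals Total in every row, which holds for mytab. As a rough check of that reasoning, here is a base R sketch on made-up counts (the vectors below are hypothetical, not the survey data), with the inclusive bounds written out by hand instead of calling %between%:

```r
# Hypothetical counts chosen to hit both edges of the range; No + Yes == Total by construction
No    <- c(271L, 1395L, 10L, 15L, 1394L)
Total <- rep(1409L, 5)
Yes   <- Total - No
mymin <- 15

# Rule as stated in the question
mask_or <- No < mymin | Yes < mymin

# Same rule as a range test on No alone, with inclusive bounds
mask_between <- !(No >= mymin & No <= Total - mymin)

identical(mask_or, mask_between)
```

If a table can contain rows where No + Yes differs from Total (for example because of missing responses), the explicit OR condition is the safer one to use.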

In pipe form...

library(magrittr)
library(knitr)

mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")

data.table(mytab)[!( No %between% c(mymin, Total - mymin) ), (badcols) := NA] %>% 
  kable

|   No|  Yes| Total|   No (%)|   Yes (%)|
|----:|----:|-----:|--------:|---------:|
|  271| 1138|  1409| 19.23350| 80.766501|
|   NA|   NA|  1409|       NA|        NA|
| 1393|   16|  1409| 98.86444|  1.135557|
| 1338|   71|  1409| 94.96097|  5.039035|
| 1254|  155|  1409| 88.99929| 11.000710|
| 1355|   54|  1409| 96.16749|  3.832505|
| 1332|   77|  1409| 94.53513|  5.464869|
| 1380|   29|  1409| 97.94180|  2.058197|
| 1360|   49|  1409| 96.52236|  3.477644|

In base R, you can subset with square brackets to select the matching rows and then assign NA to the columns you want to change. NOTE: assigning this way directly to mytab would modify mytab itself. If you want the changes in a new data.frame, copy mytab and modify the copy:

mytab2 <- mytab
mytab2[mytab2$No < 15 | mytab2$Yes < 15, c('No', 'Yes', 'No (%)', 'Yes (%)')] <- NA
mytab2
    No  Yes Total   No (%)   Yes (%)
1  271 1138  1409 19.23350 80.766501
2   NA   NA  1409       NA        NA
3 1393   16  1409 98.86444  1.135557
4 1338   71  1409 94.96097  5.039035
5 1254  155  1409 88.99929 11.000710
6 1355   54  1409 96.16749  3.832505
7 1332   77  1409 94.53513  5.464869
8 1380   29  1409 97.94180  2.058197
9 1360   49  1409 96.52236  3.477644
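If the same masking has to be applied to several frequency tables, the subsetting step above could be wrapped in a small helper. This is only a sketch: mask_small_cells and its arguments are names made up for illustration, and the demo data frame is a cut-down stand-in for mytab.

```r
# Hypothetical helper: set mask_cols to NA in every row where any of count_cols is below min_n
mask_small_cells <- function(df, count_cols, mask_cols = count_cols, min_n = 15) {
  # TRUE for rows where at least one count column falls below min_n
  small <- rowSums(df[count_cols] < min_n, na.rm = TRUE) > 0
  df[small, mask_cols] <- NA
  df  # the caller's data frame is not modified; a masked copy is returned
}

# Cut-down stand-in for mytab (first three rows only)
demo <- data.frame(No = c(271L, 1395L, 1393L), Yes = c(1138L, 14L, 16L), Total = 1409)
demo[["No (%)"]]  <- 100 * demo$No  / demo$Total
demo[["Yes (%)"]] <- 100 * demo$Yes / demo$Total

masked <- mask_small_cells(demo,
                           count_cols = c("No", "Yes"),
                           mask_cols  = c("No", "Yes", "No (%)", "Yes (%)"))
masked
```

Only the second row should end up masked, with Total left intact, and demo itself is untouched because R copies the data frame on modification inside the function.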

Try this:

df<-as.data.frame(list(No = c(271, 1395, 1393, 1338, 1254, 1355, 1332, 1380, 1360),
                   Yes = c(1138, 14, 16, 71, 155, 54, 77, 29, 49),
                   Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409)))

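# placeholder percentage columns, filled in by rowcalc
df$NoPct <- 0
df$YesPct <- 0

# x is one row as a numeric vector: No, Yes, Total, NoPct, YesPct
rowcalc <- function(x) {
  if (x[1] < 15 || x[2] < 15) {
    x[c(1, 2, 4, 5)] <- NA  # mask counts and percentages, keep Total
  } else {
    x[4] <- round(100 * x[1] / x[3], digits = 2)  # rounding to 2 decimal places
    x[5] <- round(100 * x[2] / x[3], digits = 2)
  }
  x
}
t(apply(df, 1, rowcalc))  # apply rowcalc to every row & transpose it; the result is a matrix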

#      No  Yes Total NoPct YesPct
#[1,]  271 1138  1409 19.23  80.77
#[2,]   NA   NA  1409    NA     NA
#[3,] 1393   16  1409 98.86   1.14
#[4,] 1338   71  1409 94.96   5.04
#[5,] 1254  155  1409 89.00  11.00
#[6,] 1355   54  1409 96.17   3.83
#[7,] 1332   77  1409 94.54   5.46
#[8,] 1380   29  1409 97.94   2.06
#[9,] 1360   49  1409 96.52   3.48
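Note that the apply() call above returns a matrix rather than a data frame. If a data frame is needed downstream (for instance for kable), a vectorised variant avoids the row-wise function altogether. A minimal sketch, assuming the same NoPct and YesPct column names and using only the first three rows of the data:

```r
# First three rows only, to keep the example short
df <- data.frame(No = c(271, 1395, 1393), Yes = c(1138, 14, 16), Total = 1409)

# Rows to mask, computed before any column is overwritten
small <- df$No < 15 | df$Yes < 15

# Percentages for every row, rounded to 2 decimal places
df$NoPct  <- round(100 * df$No  / df$Total, 2)
df$YesPct <- round(100 * df$Yes / df$Total, 2)

# Blank out counts and percentages in the small-cell rows, keep Total
df[small, c("No", "Yes", "NoPct", "YesPct")] <- NA
df
```

Computing the mask first matters: once No and Yes contain NA, the comparison itself would return NA for those rows.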
