
Masking specific cell values in rows that meet a condition

I am generating frequencies for some survey questions and then combining several of these questions into a data frame. Each question has a Yes/No response, which is also reported as No (%) and Yes (%).

Now, if in a given row No < 15 OR Yes < 15, then only the Total value should be visible in that row, while the No, Yes, No (%) and Yes (%) columns are masked as NA.

I have been experimenting with case_when and other options but am having little luck. I'll plug away, but if an obvious solution strikes someone, I'd be grateful. I am not wedded to dplyr for the solution. Thanks in advance!

The example data frame is shown below as mytab:

mytab <- structure(list(No = c(271L, 1395L, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L), Yes = c(1138L, 14L, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
               Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
               `No (%)` = c(19.2334989354152, 99.0063875088715, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
               `Yes (%)` = c(80.7665010645848, 0.99361249112846, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
          row.names = c(NA, -9L),
          class = "data.frame")

mytab

#>     No  Yes Total   No (%)    Yes (%)
#> 1  271 1138  1409 19.23350 80.7665011
#> 2 1395   14  1409 99.00639  0.9936125
#> 3 1393   16  1409 98.86444  1.1355571
#> 4 1338   71  1409 94.96097  5.0390348
#> 5 1254  155  1409 88.99929 11.0007097
#> 6 1355   54  1409 96.16749  3.8325053
#> 7 1332   77  1409 94.53513  5.4648687
#> 8 1380   29  1409 97.94180  2.0581973
#> 9 1360   49  1409 96.52236  3.4776437

The solution should yield mytab2, which can then be piped to knitr.

mytab2 <- structure(list(No = c(271L, NA, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L),
                         Yes = c(1138L, NA, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
                         Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
                         `No (%)` = c(19.2334989354152, NA, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
                         `Yes (%)` = c(80.7665010645848, NA, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
                    row.names = c(NA, -9L),
                    class = "data.frame")

mytab2
#>     No  Yes Total   No (%)   Yes (%)
#> 1  271 1138  1409 19.23350 80.766501
#> 2   NA   NA  1409       NA        NA
#> 3 1393   16  1409 98.86444  1.135557
#> 4 1338   71  1409 94.96097  5.039035
#> 5 1254  155  1409 88.99929 11.000710
#> 6 1355   54  1409 96.16749  3.832505
#> 7 1332   77  1409 94.53513  5.464869
#> 8 1380   29  1409 97.94180  2.058197
#> 9 1360   49  1409 96.52236  3.477644

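This is the same as divibisan's answer, but with data.table syntax that reduces repetition of the table name and uses %between% (since it seems to fit). Note that testing No alone relies on No + Yes being equal to Total in every row, as it is in mytab: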

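library(data.table)
mybadtab = data.table(mytab)   # copy mytab into a data.table

mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")
# %between% expects a length-2 right-hand side; a list lets the upper bound
# Total - mymin vary row by row
mybadtab[!( No %between% list(mymin, Total - mymin) ), (badcols) := NA]
mybadtab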

     No  Yes Total   No (%)   Yes (%)
1:  271 1138  1409 19.23350 80.766501
2:   NA   NA  1409       NA        NA
3: 1393   16  1409 98.86444  1.135557
4: 1338   71  1409 94.96097  5.039035
5: 1254  155  1409 88.99929 11.000710
6: 1355   54  1409 96.16749  3.832505
7: 1332   77  1409 94.53513  5.464869
8: 1380   29  1409 97.94180  2.058197
9: 1360   49  1409 96.52236  3.477644
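The range test on No is a shortcut that only works when No and Yes add up to Total. As a minimal sketch, assuming made-up counts in which row 3 has some non-responses (so No + Yes is less than Total), the two counts can instead be tested directly in the data.table row filter. The toy values and the shortened badcols are illustrative, not taken from the survey:

```r
library(data.table)

# Toy data (hypothetical values): in row 3, No + Yes is less than Total
dt <- data.table(
  No    = c(271L, 1395L, 40L),
  Yes   = c(1138L, 14L, 10L),
  Total = c(1409, 1409, 60)
)

mymin   <- 15
badcols <- c("No", "Yes")

# Test both counts directly instead of inferring Yes from Total - No;
# := modifies dt in place
dt[No < mymin | Yes < mymin, (badcols) := NA]
dt
```

Here row 3 is masked because Yes is 10, even though No (40) lies between 15 and Total - 15 and would pass the range test.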

In pipe form...

library(magrittr)
library(knitr)

mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")

# bounds passed as a list, since %between% expects a length-2 right-hand side
data.table(mytab)[!( No %between% list(mymin, Total - mymin) ), (badcols) := NA] %>% 
  kable

|   No|  Yes| Total|   No (%)|   Yes (%)|
|----:|----:|-----:|--------:|---------:|
|  271| 1138|  1409| 19.23350| 80.766501|
|   NA|   NA|  1409|       NA|        NA|
| 1393|   16|  1409| 98.86444|  1.135557|
| 1338|   71|  1409| 94.96097|  5.039035|
| 1254|  155|  1409| 88.99929| 11.000710|
| 1355|   54|  1409| 96.16749|  3.832505|
| 1332|   77|  1409| 94.53513|  5.464869|
| 1380|   29|  1409| 97.94180|  2.058197|
| 1360|   49|  1409| 96.52236|  3.477644|
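Since the question started out with dplyr, here is a rough sketch of the same masking as a dplyr pipeline. It assumes dplyr 1.0 or later (for across()), uses only the first three rows of mytab to stay short, and introduces a temporary helper column named small that is not part of the original data:

```r
library(dplyr)

# First three rows of mytab, with the percentages recomputed
tab <- data.frame(
  No    = c(271L, 1395L, 1393L),
  Yes   = c(1138L, 14L, 16L),
  Total = c(1409, 1409, 1409)
)
tab[["No (%)"]]  <- 100 * tab$No  / tab$Total
tab[["Yes (%)"]] <- 100 * tab$Yes / tab$Total

badcols <- c("No", "Yes", "No (%)", "Yes (%)")

out <- tab %>%
  mutate(small = No < 15 | Yes < 15) %>%                      # flag rows first
  mutate(across(all_of(badcols), ~ replace(.x, small, NA))) %>% # then mask them
  select(-small)                                              # drop the helper
out
```

The flag is computed in its own mutate() step so that the mask is fixed before No and Yes are overwritten. The result is an ordinary data frame, so it can be passed on to knitr::kable() in the same way as above.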

In base R, you can subset with square brackets to select the appropriate rows and then assign NA to the columns you want to change. NOTE: assigning into mytab directly would overwrite its values. To keep the original and make the changes in a new data.frame, copy mytab and modify the copy:

mytab2 <- mytab
mytab2[mytab2$No < 15 | mytab2$Yes < 15, c('No', 'Yes', 'No (%)', 'Yes (%)')] <- NA
mytab2
    No  Yes Total   No (%)   Yes (%)
1  271 1138  1409 19.23350 80.766501
2   NA   NA  1409       NA        NA
3 1393   16  1409 98.86444  1.135557
4 1338   71  1409 94.96097  5.039035
5 1254  155  1409 88.99929 11.000710
6 1355   54  1409 96.16749  3.832505
7 1332   77  1409 94.53513  5.464869
8 1380   29  1409 97.94180  2.058197
9 1360   49  1409 96.52236  3.477644
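If the same masking is needed for several tables, the two lines above could be wrapped in a small function. This is only a sketch: the name mask_small_cells and its arguments are hypothetical, and it assumes the count columns are called No and Yes, as in mytab. The demo uses the first three rows of mytab:

```r
# Hypothetical helper: mask `cols` in rows where either count is below min_n
mask_small_cells <- function(df, cols, min_n = 15) {
  # which() drops NA comparisons, so rows with missing counts are left untouched
  small <- which(df$No < min_n | df$Yes < min_n)
  if (length(small) > 0) {
    df[small, cols] <- NA
  }
  df  # the caller's data frame is not modified; a masked copy is returned
}

demo <- data.frame(
  No    = c(271L, 1395L, 1393L),
  Yes   = c(1138L, 14L, 16L),
  Total = c(1409, 1409, 1409)
)
demo[["No (%)"]]  <- 100 * demo$No  / demo$Total
demo[["Yes (%)"]] <- 100 * demo$Yes / demo$Total

masked <- mask_small_cells(demo, c("No", "Yes", "No (%)", "Yes (%)"))
masked
```

Because R copies the data frame on modification inside the function, demo keeps its original values and masked holds the NA-filled version.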

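Try this, which recomputes the percentages row by row with apply (note that the result is a matrix rather than a data frame):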

df <- data.frame(No = c(271, 1395, 1393, 1338, 1254, 1355, 1332, 1380, 1360),
                 Yes = c(1138, 14, 16, 71, 155, 54, 77, 29, 49),
                 Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409))

# placeholder columns, filled in by rowcalc
df$NoPct <- 0
df$YesPct <- 0

# x is one row as a numeric vector: No, Yes, Total, NoPct, YesPct
rowcalc <- function(x) {
  if (x[1] < 15 || x[2] < 15) {
    x[c(1, 2, 4, 5)] <- NA                       # mask counts and percentages
  } else {
    x[4] <- round(100 * x[1] / x[3], digits = 2) # rounding to 2 decimal places
    x[5] <- round(100 * x[2] / x[3], digits = 2)
  }
  x
}
t(apply(df, 1, rowcalc)) # apply rowcalc to every row and transpose the result

#      No  Yes Total NoPct YesPct
#[1,]  271 1138  1409 19.23  80.77
#[2,]   NA   NA  1409    NA     NA
#[3,] 1393   16  1409 98.86   1.14
#[4,] 1338   71  1409 94.96   5.04
#[5,] 1254  155  1409 89.00  11.00
#[6,] 1355   54  1409 96.17   3.83
#[7,] 1332   77  1409 94.54   5.46
#[8,] 1380   29  1409 97.94   2.06
#[9,] 1360   49  1409 96.52   3.48
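The same percentages and masking can also be computed without a row-wise apply. The following is a minimal sketch using vectorized base R on the first three rows of df; it keeps the result as a data frame instead of a matrix:

```r
df <- data.frame(
  No    = c(271, 1395, 1393),
  Yes   = c(1138, 14, 16),
  Total = c(1409, 1409, 1409)
)

# Percentages for every row at once, rounded to 2 decimal places
df$NoPct  <- round(100 * df$No  / df$Total, 2)
df$YesPct <- round(100 * df$Yes / df$Total, 2)

# Mask counts and percentages in rows where either count is below 15
small <- df$No < 15 | df$Yes < 15
df[small, c("No", "Yes", "NoPct", "YesPct")] <- NA
df
```

Working on whole columns avoids the conversion to a matrix that apply performs, which matters once the table also contains non-numeric columns such as question labels.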

