Masking specific cell values in rows that meet a condition
I am generating frequencies for some survey questions and then combining some of these questions into a data frame. Each question has a Yes/No response, which is also reported as a percentage in the No (%) and Yes (%) columns.
Now, if in a given row No < 15 OR Yes < 15, then only the Total value should be visible in that row, while the No, Yes, No (%) and Yes (%) columns are masked as NA.
I have been experimenting with case_when and other options but having little luck. I'll keep plugging away, but if an obvious solution strikes someone, I'd be grateful. I am not wedded to dplyr for the solution. Thanks in advance!
The example data frame is shown below as mytab:
mytab <- structure(list(No = c(271L, 1395L, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L), Yes = c(1138L, 14L, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
`No (%)` = c(19.2334989354152, 99.0063875088715, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
`Yes (%)` = c(80.7665010645848, 0.99361249112846, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
row.names = c(NA, -9L),
class = "data.frame")
mytab
#> No Yes Total No (%) Yes (%)
#> 1 271 1138 1409 19.23350 80.7665011
#> 2 1395 14 1409 99.00639 0.9936125
#> 3 1393 16 1409 98.86444 1.1355571
#> 4 1338 71 1409 94.96097 5.0390348
#> 5 1254 155 1409 88.99929 11.0007097
#> 6 1355 54 1409 96.16749 3.8325053
#> 7 1332 77 1409 94.53513 5.4648687
#> 8 1380 29 1409 97.94180 2.0581973
#> 9 1360 49 1409 96.52236 3.4776437
The solution should yield mytab2, which can then be piped to knitr.
mytab2 <- structure(list(No = c(271L, NA, 1393L, 1338L, 1254L, 1355L, 1332L, 1380L, 1360L),
Yes = c(1138L, NA, 16L, 71L, 155L, 54L, 77L, 29L, 49L),
Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409),
`No (%)` = c(19.2334989354152, NA, 98.8644428672818, 94.9609652235628, 88.9992902767921, 96.1674946770759, 94.5351312987935, 97.9418026969482, 96.5223562810504),
`Yes (%)` = c(80.7665010645848, NA, 1.13555713271824, 5.03903477643719, 11.0007097232079, 3.83250532292406, 5.46486870120653, 2.05819730305181, 3.47764371894961)),
row.names = c(NA, -9L),
class = "data.frame")
mytab2
#> No Yes Total No (%) Yes (%)
#> 1 271 1138 1409 19.23350 80.766501
#> 2 NA NA 1409 NA NA
#> 3 1393 16 1409 98.86444 1.135557
#> 4 1338 71 1409 94.96097 5.039035
#> 5 1254 155 1409 88.99929 11.000710
#> 6 1355 54 1409 96.16749 3.832505
#> 7 1332 77 1409 94.53513 5.464869
#> 8 1380 29 1409 97.94180 2.058197
#> 9 1360 49 1409 96.52236 3.477644
This is the same as divibisan's answer, but with data.table syntax, which reduces repetition of the table name, and with %between% (since it seems to fit):
library(data.table)
mybadtab = data.table(mytab)
mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")
mybadtab[!( No %between% c(mymin, Total - mymin) ), (badcols) := NA]
No Yes Total No (%) Yes (%)
1: 271 1138 1409 19.23350 80.766501
2: NA NA 1409 NA NA
3: 1393 16 1409 98.86444 1.135557
4: 1338 71 1409 94.96097 5.039035
5: 1254 155 1409 88.99929 11.000710
6: 1355 54 1409 96.16749 3.832505
7: 1332 77 1409 94.53513 5.464869
8: 1380 29 1409 97.94180 2.058197
9: 1360 49 1409 96.52236 3.477644
In pipe form:
library(magrittr)
library(knitr)
mymin = 15
badcols = c("No", "Yes", "No (%)", "Yes (%)")
data.table(mytab)[!( No %between% c(mymin, Total - mymin) ), (badcols) := NA] %>%
kable
| No| Yes| Total| No (%)| Yes (%)|
|----:|----:|-----:|--------:|---------:|
| 271| 1138| 1409| 19.23350| 80.766501|
| NA| NA| 1409| NA| NA|
| 1393| 16| 1409| 98.86444| 1.135557|
| 1338| 71| 1409| 94.96097| 5.039035|
| 1254| 155| 1409| 88.99929| 11.000710|
| 1355| 54| 1409| 96.16749| 3.832505|
| 1332| 77| 1409| 94.53513| 5.464869|
| 1380| 29| 1409| 97.94180| 2.058197|
| 1360| 49| 1409| 96.52236| 3.477644|
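The range check on No in the data.table answer above stands in for the question's No < 15 OR Yes < 15 rule. A minimal base R sketch of why the two agree, assuming that No + Yes always equals Total and that the range bounds are inclusive (the variable names here are illustrative, and the counts are a subset of the example data):

```r
# Subset of the example counts from the question.
No    <- c(271L, 1395L, 1393L, 1338L)
Yes   <- c(1138L, 14L, 16L, 71L)
Total <- No + Yes   # assumption: every respondent answered Yes or No
mymin <- 15

# Rows to mask, written as in the question.
mask_or <- No < mymin | Yes < mymin

# Rows to mask, written as an inclusive range check on No alone.
# Yes >= mymin is the same as No <= Total - mymin when No + Yes == Total.
mask_range <- !(No >= mymin & No <= Total - mymin)

# Both formulations flag the same rows.
identical(mask_or, mask_range)
```

If some respondents skipped a question, so that No + Yes is smaller than Total, the two conditions can disagree and the explicit OR form is the safer choice.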
In base R, you can subset with square brackets to select the appropriate rows and then assign NA to the columns you want to change. NOTE: assigning directly into mytab would modify mytab itself. If you want to make the changes in a new data.frame, copy mytab and modify the copy:
mytab2 <- mytab
mytab2[mytab2$No < 15 | mytab2$Yes < 15, c('No', 'Yes', 'No (%)', 'Yes (%)')] <- NA
mytab2
No Yes Total No (%) Yes (%)
1 271 1138 1409 19.23350 80.766501
2 NA NA 1409 NA NA
3 1393 16 1409 98.86444 1.135557
4 1338 71 1409 94.96097 5.039035
5 1254 155 1409 88.99929 11.000710
6 1355 54 1409 96.16749 3.832505
7 1332 77 1409 94.53513 5.464869
8 1380 29 1409 97.94180 2.058197
9 1360 49 1409 96.52236 3.477644
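If the same masking is needed for several tables, the base R approach above could be wrapped in a small function. This is only a sketch: mask_small_cells and its arguments are hypothetical names, not part of any package. It uses which() so that rows whose counts are already NA are left untouched rather than triggering an error in the subscripted assignment.

```r
# Hypothetical helper; the name and arguments are illustrative only.
mask_small_cells <- function(df, min_n = 15,
                             cols = c("No", "Yes", "No (%)", "Yes (%)")) {
  # which() drops NA comparisons, so rows with missing counts are skipped
  rows <- which(df$No < min_n | df$Yes < min_n)
  df[rows, cols] <- NA
  df  # the caller's data frame is not modified; a masked copy is returned
}

# Small demo table with one maskable row and one row with a missing count.
demo <- data.frame(No = c(271L, 1395L, NA), Yes = c(1138L, 14L, 20L),
                   Total = c(1409, 1409, 1409), check.names = FALSE)
demo[["No (%)"]]  <- 100 * demo$No / demo$Total
demo[["Yes (%)"]] <- 100 * demo$Yes / demo$Total

masked <- mask_small_cells(demo)
masked
```

Because R copies the data frame when it is modified inside the function, demo keeps its original values and masked holds the result, which can then be passed on to knitr.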
Try this:
df<-as.data.frame(list(No = c(271, 1395, 1393, 1338, 1254, 1355, 1332, 1380, 1360),
Yes = c(1138, 14, 16, 71, 155, 54, 77, 29, 49),
Total = c(1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409)))
df$NoPct<-0
df$YesPct<-0
rowcalc<-function(x){
if (x[1]<15 | x[2]<15){
x[1]= x[2]= x[4]=x[5]=NA
} else {
x[4]<- round(100*x[1]/x[3],digits=2) #rounding to 2 decimal places
x[5]<- round(100*x[2]/x[3],digits=2)
}
return(x)
}
t(apply(df,1,rowcalc)) #apply rowcalc to every row & transpose it
# No Yes Total NoPct YesPct
#[1,] 271 1138 1409 19.23 80.77
#[2,] NA NA 1409 NA NA
#[3,] 1393 16 1409 98.86 1.14
#[4,] 1338 71 1409 94.96 5.04
#[5,] 1254 155 1409 89.00 11.00
#[6,] 1355 54 1409 96.17 3.83
#[7,] 1332 77 1409 94.54 5.46
#[8,] 1380 29 1409 97.94 2.06
#[9,] 1360 49 1409 96.52 3.48
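Note that t(apply(...)) returns a matrix rather than a data frame, as the [1,] row labels in the output show. A possible vectorized variant that keeps the data.frame class is sketched below; it reuses the NoPct and YesPct column names from the answer, uses only the first three rows of the example data, and is an alternative formulation, not the answerer's method.

```r
# First three rows of the example counts.
df <- data.frame(No    = c(271, 1395, 1393),
                 Yes   = c(1138, 14, 16),
                 Total = c(1409, 1409, 1409))

# Flag rows where either count falls below the threshold.
small <- df$No < 15 | df$Yes < 15

# Compute percentages for all rows at once, rounded to 2 decimal places.
df$NoPct  <- round(100 * df$No / df$Total, 2)
df$YesPct <- round(100 * df$Yes / df$Total, 2)

# Mask everything except Total in the flagged rows.
df[small, c("No", "Yes", "NoPct", "YesPct")] <- NA
df
```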