
R mutate_at on a subset of rows

My question is similar to this post (Applying mutate_at conditionally to specific rows in a dataframe in R), and I could reproduce its result. But when I tried to apply it to my problem, which is putting parentheses around the cell values for selected rows and columns, I ran into error messages. Here's a reproducible example.

df <- structure(list(dep = c("cyl", "cyl", "disp", "disp", "drat", 
"drat", "hp", "hp", "mpg", "mpg"), name = c("estimate", "t_stat", 
"estimate", "t_stat", "estimate", "t_stat", "estimate", "t_stat", 
"estimate", "t_stat"), dat1 = c(1.151, 6.686, 102.902, 12.107, 
-0.422, -5.237, 37.576, 5.067, -5.057, -8.185), dat2 = c(1.274, 
8.423, 106.429, 12.148, -0.394, -5.304, 38.643, 6.172, -4.843, 
-10.622), dat3 = c(1.078, 5.191, 103.687, 7.79, -0.194, -2.629, 
36.777, 4.842, -4.539, -7.91)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))  

Given the above data frame, I hope to put parentheses around the cell values of columns dat1, dat2 and dat3 when name == "t_stat". Here's what I've tried, but it seems that paste0 is not accepted inside the case_when function in this case.

require(tidyverse)
df %>% mutate_at(vars(matches("dat")),
                 funs( case_when(name == 't_stat' ~ paste0("(", ., ")"), TRUE ~ .) ))
Error: must be a character vector, not a double vector

When I use brute force, namely mutating each column separately, it works, but my actual problem has more than 10 columns, so this is not really practical.

require(tidyverse)
df %>% mutate(dat1 = ifelse(name == "t_stat", paste0("(", dat1, ")"), dat1),
              dat2 = ifelse(name == "t_stat", paste0("(", dat2, ")"), dat2),
              dat3 = ifelse(name == "t_stat", paste0("(", dat3, ")"), dat3))
# A tibble: 10 x 5
   dep   name     dat1     dat2      dat3    
   <chr> <chr>    <chr>    <chr>     <chr>   
 1 cyl   estimate 1.151    1.274     1.078   
 2 cyl   t_stat   (6.686)  (8.423)   (5.191) 
 3 disp  estimate 102.902  106.429   103.687 
 4 disp  t_stat   (12.107) (12.148)  (7.79)  
 5 drat  estimate -0.422   -0.394    -0.194  
 6 drat  t_stat   (-5.237) (-5.304)  (-2.629)
 7 hp    estimate 37.576   38.643    36.777  
 8 hp    t_stat   (5.067)  (6.172)   (4.842) 
 9 mpg   estimate -5.057   -4.843    -4.539  
10 mpg   t_stat   (-8.185) (-10.622) (-7.91) 

Basically, you need to convert dbl to char first, which is also what the error message is saying: Error: must be a character vector, not a double vector

As @Rohan rightly said, case_when is type-strict, meaning it expects every output to be of the same class.

df %>% mutate_at(vars(matches("dat")),
                 ~ case_when(name == 't_stat' ~ paste0("(", as.character(.x), ")"),
                             TRUE ~ as.character(.x)))

Output:

# A tibble: 10 x 5
   dep   name     dat1     dat2      dat3    
   <chr> <chr>    <chr>    <chr>     <chr>   
 1 cyl   estimate 1.151    1.274     1.078   
 2 cyl   t_stat   (6.686)  (8.423)   (5.191) 
 3 disp  estimate 102.902  106.429   103.687 
 4 disp  t_stat   (12.107) (12.148)  (7.79)  
 5 drat  estimate -0.422   -0.394    -0.194  
 6 drat  t_stat   (-5.237) (-5.304)  (-2.629)
 7 hp    estimate 37.576   38.643    36.777  
 8 hp    t_stat   (5.067)  (6.172)   (4.842) 
 9 mpg   estimate -5.057   -4.843    -4.539  
10 mpg   t_stat   (-8.185) (-10.622) (-7.91) 
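
The type strictness described above can be seen in isolation. The following is a minimal sketch, assuming dplyr is installed; the vectors x and flag are made up for illustration. It also suggests why the question's ifelse version ran without an error: base ifelse coerces silently, while case_when refuses to combine a character branch with a double branch.

```r
library(dplyr)

# Hypothetical toy inputs, not taken from the question's data.
x    <- c(1.5, 2.5)
flag <- c(TRUE, FALSE)

# Base ifelse() silently coerces the numeric branch to character.
coerced <- ifelse(flag, paste0("(", x, ")"), x)

# case_when() raises an error when one branch is character and the other
# is double; the error is caught here and replaced by a sentinel string.
mixed <- tryCatch(
  case_when(flag ~ paste0("(", x, ")"), TRUE ~ x),
  error = function(e) "type error"
)

# Converting both branches to character satisfies case_when().
strict <- case_when(flag ~ paste0("(", x, ")"), TRUE ~ as.character(x))
```

The exact wording of the error depends on the installed dplyr version, so the sketch only checks that an error occurs.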

The error message is... unhelpful.

Your problem is that you're mixing numeric and character data in a column. The dat variables are numeric.

df %>% mutate_at(vars(matches("dat")), 
                 funs( case_when(name == 't_stat' ~ paste0("(", ., ")"),
                                 TRUE ~ as.character(.))))

# A tibble: 10 x 5
   dep   name     dat1     dat2      dat3    
   <chr> <chr>    <chr>    <chr>     <chr>   
 1 cyl   estimate 1.151    1.274     1.078   
 2 cyl   t_stat   (6.686)  (8.423)   (5.191) 
 3 disp  estimate 102.902  106.429   103.687 
 4 disp  t_stat   (12.107) (12.148)  (7.79)  
 5 drat  estimate -0.422   -0.394    -0.194  
 6 drat  t_stat   (-5.237) (-5.304)  (-2.629)
 7 hp    estimate 37.576   38.643    36.777  
 8 hp    t_stat   (5.067)  (6.172)   (4.842) 
 9 mpg   estimate -5.057   -4.843    -4.539  
10 mpg   t_stat   (-8.185) (-10.622) (-7.91) 
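
For comparison, the same conditional wrapping can be sketched in base R, without dplyr. This is only an illustration, not part of any answer above: the four-row data frame is an excerpt of the question's data, and the helper variable cols is a name introduced here. Both branches are made character explicitly so that the column type does not depend on silent coercion.

```r
# Four-row excerpt of the question's data (dat3 and dep omitted for brevity).
df <- data.frame(
  name = c("estimate", "t_stat", "estimate", "t_stat"),
  dat1 = c(1.151, 6.686, 102.902, 12.107),
  dat2 = c(1.274, 8.423, 106.429, 12.148),
  stringsAsFactors = FALSE
)

# Select the target columns by name pattern.
cols <- grep("^dat", names(df), value = TRUE)

# Rewrite every selected column; t_stat rows get parentheses,
# all other rows are converted to character unchanged.
df[cols] <- lapply(df[cols], function(x) {
  ifelse(df$name == "t_stat", paste0("(", x, ")"), as.character(x))
})
```

Because the columns are picked by pattern, this scales to any number of dat columns without listing them one by one.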

case_when is type-strict, meaning it expects all outputs to be of the same class. Your original columns are of type numeric, whereas adding "(" around your data makes them of class character.

Also, funs is long deprecated, and mutate_at will soon be replaced with across.

library(dplyr)
df %>% 
    mutate_at(vars(matches("dat")), 
      ~case_when(name == 't_stat' ~ paste0("(", ., ")"), TRUE ~ as.character(.)))
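
Since across is mentioned as the replacement for mutate_at, here is a sketch of how the same call might look with it, assuming dplyr 1.0.0 or later. The two-row data frame is an excerpt of the question's data and res is a name introduced for illustration.

```r
library(dplyr)

# Two-row excerpt of the question's data (dat3 and dep omitted for brevity).
df <- data.frame(
  name = c("estimate", "t_stat"),
  dat1 = c(1.151, 6.686),
  dat2 = c(1.274, 8.423),
  stringsAsFactors = FALSE
)

# across() applies the lambda to every column whose name matches "dat";
# `name` inside the lambda refers to the data frame's `name` column.
res <- df %>%
  mutate(across(
    matches("dat"),
    ~ case_when(
      name == "t_stat" ~ paste0("(", .x, ")"),
      TRUE ~ as.character(.x)
    )
  ))
```

As in the mutate_at version, both branches return character vectors, which keeps case_when from raising a type error.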
