How to change numeric values in a subset of columns in an R dataframe to other numeric values?

Question

Disclaimer: I am an R newbie, so some of the information I provide might be redundant. But after 2 hours of failed attempts at such a seemingly easy task, I deemed it appropriate to ask a question in this forum.

So I have a dataset with currently 4 rows/subjects (more to come, as this is ongoing research) and 259 variables/columns. 240 variables of this dataset are ratings of fit ("How well does the following adjective match the dimension X?") and 19 variables are sociodemographic.

For these 240 rating variables, my subjects could give a rating ranging from 1 ("fits very badly") to 7 ("fits very well"). Consequently, I have 240 variables with values from 1 to 7. I would like to change these numeric values as follows (the procedure being the same for all of the 240 columns):

1 should change to 0, 2 should change to 1/6, 3 should change to 2/6, 4 should change to 3/6, 5 should change to 4/6, 6 should change to 5/6 and 7 should change to 1. So no matter where in the 240 columns, a 1 should change to 0 and so on.

I have tried the following approaches:

Recode numeric values in R

In this post, it says that

x <- 1:10

# With recode function using backquotes as arguments
dplyr::recode(x, `2` = 20L, `4` = 40L)
# [1]  1 20  3 40  5  6  7  8  9 10

# With case_when function
dplyr::case_when(
  x %in% 2 ~ 20,
  x %in% 4 ~ 40,
  TRUE ~ as.numeric(x)
)
#  [1]  1 20  3 40  5  6  7  8  9 10

Consequently, I tried this:

df = ds %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20,                      AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20)
                   %>% recode(.,`1`=0,`2`=-1/6,`3`=-2/6, `4`=3/6,`5`=4/6, `6`=5/6, `7`=1))

with AD01_01 etc. being the column names for the adjectives my subjects should rate. I also tried it without the ".," after recode(, to no avail.

This code is flawed because it omits the 19 columns of sociodemographic data I want to keep in my dataset. Moreover, I get the error "unexpected SPECIAL in "%>%"". I thought R might accept my selected columns with the pipe operator as the "x" in the recode function. Apparently, this is not the case. I also tried to read up on the R documentation of the recode function, but it made things much more confusing for me, as there were a lot of technical terms I don't understand.

As there is another option mentioned in the post, I also tried this:

df = df %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20,                     AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20) %>% case_when (.,%in% 1~0,%in% 2~1/6,%in%3~2/6,%in%4~3/6,%in%5~4/6,%in%6~5/6,%in%7~1)

I thought I could give the output of the select function to the case_when function. Apparently, this is also not the case.

When I execute this command, I get

Error: unexpected SPECIAL in:
"df = df %>% select(AD01_01:AD01_20,AD02_01:AD02_20,AD03_01:AD03_20,AD04_01:AD04_20,AD05_01:AD05_20,AD06_01:AD06_20,                      AD09_01:AD09_20,AD10_01:AD10_20,AD11_01:AD11_20,AD12_01:AD12_20,AD13_01:AD13_20,AD14_01:AD14_20) %>% case_when (%in%"

Reading up on other possibilities, I found this

https://rstudio-education.github.io/hopr/modify.html

Example dataset:

head(dplyr::storms)

## # A tibble: 6 x 13
##   name   year month   day  hour   lat  long status category  wind pressure
##   <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr>  <ord>    <int>    <int>
## 1 Amy    1975     6    27     0  27.5 -79   tropi… -1          25     1013
## 2 Amy    1975     6    27     6  28.5 -79   tropi… -1          25     1013
## 3 Amy    1975     6    27    12  29.5 -79   tropi… -1          25     1013
## 4 Amy    1975     6    27    18  30.5 -79   tropi… -1          25     1013
## 5 Amy    1975     6    28     0  31.5 -78.8 tropi… -1          25     1012
## 6 Amy    1975     6    28     6  32.4 -78.7 tropi… -1          25     1012
## # ... with 2 more variables: ts_diameter <dbl>, hu_diameter <dbl>

We decide that we want to recode all NAs to 9999.

storm <- storms

storm$ts_diameter[is.na(storm$ts_diameter)] <- 9999
summary(storm$ts_diameter)

ds$AD01_01:AD01_20[1(ds$AD01_01:AD01_20)] <- 0, ds$AD01_01:AD01_20[2(ds$AD01_01:AD01_20)] <- 1/6, ds$AD01_01:AD01_20[3(ds$AD01_01:AD01_20)] <- 2/6, 
ds$AD01_01:AD01_20[4(ds$AD01_01:AD01_20)] <- 3/6, ds$AD01_01:AD01_20[5(ds$AD01_01:AD01_20)] <- 4/6, ds$AD01_01:AD01_20[6(ds$AD01_01:AD01_20)] <- 5/6, 
ds$AD01_01:AD01_20[7(ds$AD01_01:AD01_20)] <- 1

My idea in this case was to use the "assign" function for multiple columns at a time (this effort just concerns 20 of my 240 columns), and it also didn't work. I got the error "could not find function ":<-"", which is weird because I thought this was a basic command. The only noteworthy thing that might explain it is that I executed "library(readr)" and "library(tidyverse)" beforehand.

After 2 hours, I finally give up. I would appreciate it if you found the time to help me. I would also like to know where I went wrong and why my code doesn't work (or alternatively, please explain why your code works).

Answer 1

How about using mutate(across())? For example, if all your "adjective rating" columns start with "AD", you can do something like this:

library(dplyr)
ds %>% mutate(across(starts_with("AD"), ~(.x-1)/6))
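To see what this does, here is a minimal sketch on a made-up data frame. The column names `age`, `AD01_01`, and `AD01_02` and their values are assumptions for illustration, not the asker's actual data. It shows that `mutate(across())` rescales only the matching columns and leaves the remaining columns in place.

```r
library(dplyr)

# Toy data (hypothetical): one demographic column and two rating columns
ds <- tibble(
  age     = c(23, 31, 45, 52),
  AD01_01 = c(1, 4, 7, 2),
  AD01_02 = c(7, 7, 3, 1)
)

# Rescale every column whose name starts with "AD" from 1..7 to 0..1
rescaled <- ds %>% mutate(across(starts_with("AD"), ~ (.x - 1) / 6))

rescaled$AD01_01  # 0, 0.5, 1, and 1/6
names(rescaled)   # "age" "AD01_01" "AD01_02"
```

Note that `mutate()` returns a new data frame rather than modifying `ds` in place, so the result has to be assigned, as `rescaled` is here.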

Explanation of where you went wrong with your code:

First, your select(...) %>% recode(...) was close. However, when you use select, you are reducing ds to only the selected columns, so recoding those values and assigning the result to df will leave df without the demographic variables.

Second, if you want to use recode you can, but you can't feed it an entire data frame/tibble, as you are doing when you pipe ( %>% ) the selected columns to it. Instead, you can use recode() iteratively in .fns, on each of the columns in the .cols param of across(), like this:

ds %>%
  mutate(across(
    .cols = starts_with("AD"),
    # recode() is applied to one column at a time; 2 maps to 1/6 and 3 to 2/6
    .fns = ~recode(.x, `1` = 0, `2` = 1/6, `3` = 2/6, `4` = 3/6, `5` = 4/6, `6` = 5/6, `7` = 1)
  ))
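The same recoding can also be written in base R, which is closer to the third attempt in the question. The sketch below is only an illustration on made-up data (the column names and values are assumptions): it picks the rating columns by name with `grep()` and replaces each value through a lookup vector indexed by the original rating.

```r
# Toy data (hypothetical): one demographic column and two rating columns
ds <- data.frame(
  age     = c(23, 31, 45, 52),
  AD01_01 = c(1, 4, 7, 2),
  AD01_02 = c(7, 7, 3, 1)
)

# Lookup vector: position i holds the new value for an original rating of i
new_values <- c(0, 1/6, 2/6, 3/6, 4/6, 5/6, 1)

# Names of the rating columns (assumes they all start with "AD")
rating_cols <- grep("^AD", names(ds), value = TRUE)

# Replace the values in each rating column; other columns are untouched
ds[rating_cols] <- lapply(ds[rating_cols], function(x) new_values[x])
```

This also hints at why `ds$AD01_01:AD01_20[...] <- 0` fails: `$` extracts a single column, and a range of column names such as `AD01_01:AD01_20` is only understood inside tidyselect contexts like `select()` or `across()`. Outside of those, `:` is the ordinary sequence operator, and assigning to an expression built with it makes R look for a replacement function named `:<-`, which does not exist.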

1 2022-08-10 15:03:57

Explanation of where you went wrong with your code:解释你的代码哪里出错了：

如何将 R dataframe 中的列子集中的数值更改为其他数值？

问题描述

1 个解决方案

解决方案1 1 2022-08-10 15:03:57

Explanation of where you went wrong with your code:解释你的代码哪里出错了：

解决方案1
1 2022-08-10 15:03:57