
(R) How to copy paste values from one column based on another column and ID in R

For simplicity, let's assume I have two columns. First: ID (a string of codes such as AA23, BA53, NA, etc.). Second: Age (18, 32, 55, 23, etc.).

IDs sometimes repeat (i.e., one person, AA23, filled in the survey on several days, but was asked how old he is only on the first day, not on the second and third days).

I want to copy the values in the Age column to the other rows with the same ID, so that I have a 'long format' of the dataframe.

dput(data):

structure(list(Code = c("MW68", "AW80", "EW40", "BW60", "Wn36", 
"ZK45", "SI55", "MW68", "EW40", "DC06", NA, "IW28"), Age = c("52", 
"26", "34", "26", "20", "35", NA, NA, NA, NA, NA, NA)), row.names = c(5L, 
6L, 7L, 8L, 9L, 10L, 400L, 401L, 402L, 403L, 404L, 405L), class = "data.frame")

Input:

ID   Age
AA23 18
BA53 32
AC13 55
AA23 NA
BA53 NA  
AC13 NA
NA   23
AA23 NA
(the trick is that sometimes ID is NA)

And the desired output:
ID   Age
AA23 18
BA53 32
AC13 55
AA23 18
BA53 32  
AC13 55
NA   23
AA23 18

Thank you in advance!

I'm not quite sure I understood correctly what you want to do, but this code looks for rows where Age is NA and fills in the mean Age of the other rows with the same entry in Code. Age is stored as character in the posted data, so it is converted to numeric first; otherwise mean() returns NA with a warning. Obviously, this cannot fill anything in if a Code has no Age value anywhere in the table (the mean of no values is NaN). If different rows with the same Code have different Age values, it fills in their mean, since you didn't specify what to do in such a case.

# Age is stored as character in the dput() data; mean() needs numbers
data$Age <- as.numeric(data$Age)

for (i in seq_len(nrow(data))) {
  # only fill rows that have a Code but no Age
  if (!is.na(data$Code[i]) && is.na(data$Age[i])) {
    # which() drops the NA comparisons produced by rows with a missing Code
    same_code <- which(data$Code == data$Code[i])
    data$Age[i] <- mean(data$Age[same_code], na.rm = TRUE)
  }
}

This skips rows with NA in the Code column.
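The same idea can also be written without a loop. The following is only a sketch in base R, using the example input from the question; the names survey, known, lookup and to_fill are made up for illustration, and it takes the first known Age per ID rather than the mean.

```r
# Example input from the question (ID is sometimes NA)
survey <- data.frame(
  Code = c("AA23", "BA53", "AC13", "AA23", "BA53", "AC13", NA, "AA23"),
  Age  = c(18, 32, 55, NA, NA, NA, 23, NA)
)

# Lookup table: the first known Age for every non-missing Code
known  <- survey[!is.na(survey$Code) & !is.na(survey$Age), ]
lookup <- known[!duplicated(known$Code), ]

# Rows that have a Code but no Age; rows with a missing Code are left alone
to_fill <- !is.na(survey$Code) & is.na(survey$Age)

# match() returns NA for a Code that is not in the lookup, so Age stays NA there
survey$Age[to_fill] <- lookup$Age[match(survey$Code[to_fill], lookup$Code)]

print(survey)
```

For this input the Age column becomes 18, 32, 55, 18, 32, 55, 23, 18, which is the desired output from the question.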

You can also use the function coalesce, which replaces each NA with the corresponding value from a fallback that you supply. Here the fallback is the first Age value within each Code group (Code is the grouping variable):

library(dplyr)

data %>%
  group_by(Code) %>%
  mutate(across(Age, ~ coalesce(.x, first(.x))))

# A tibble: 12 x 2
# Groups:   Code [10]
   Code  Age  
   <chr> <chr>
 1 MW68  52   
 2 AW80  26   
 3 EW40  34   
 4 BW60  26   
 5 Wn36  20   
 6 ZK45  35   
 7 SI55  NA   
 8 MW68  52   
 9 EW40  34   
10 DC06  NA   
11 NA    NA   
12 IW28  NA 
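Note that first() returns the first value of the group even when that value is NA, so the approach above relies on the known Age sitting in the first row of each Code. A hedged variant for data where that is not guaranteed is sketched below; the data frame survey2 is invented for illustration and is not from the question.

```r
library(dplyr)

# Hypothetical data: for AA23 the known Age is NOT in the first row
survey2 <- data.frame(
  Code = c("AA23", "AA23", "BA53", "BA53"),
  Age  = c(NA, "18", "32", NA)
)

filled <- survey2 %>%
  group_by(Code) %>%
  # fall back to the first non-missing Age of the group;
  # if a group has no Age at all, first() yields NA and nothing changes
  mutate(Age = coalesce(Age, first(Age[!is.na(Age)]))) %>%
  ungroup()

print(filled)
```

Here both AA23 rows end up with "18" and both BA53 rows with "32".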

Here's a solution based on zoo's function na.locf ("last observation carried forward"): first you group by Code, then you mutate column Age using ifelse, carrying the last non-NA observation forward:

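library(dplyr)
library(zoo)

data %>%
  group_by(Code) %>%
  # na.rm = FALSE keeps leading NAs, so the result has the same length as Age
  mutate(Age = ifelse(is.na(Age), na.locf(Age, na.rm = FALSE), Age))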
# A tibble: 12 x 2
# Groups:   Code [10]
   Code  Age  
   <chr> <chr>
 1 MW68  52   
 2 AW80  26   
 3 EW40  34   
 4 BW60  26   
 5 Wn36  20   
 6 ZK45  35   
 7 SI55  NA   
 8 MW68  52   # <- value `carried forward`
 9 EW40  34   # <- value `carried forward`
10 DC06  NA   
11 NA    NA   
12 IW28  NA  
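One detail of na.locf worth checking when using it inside mutate: by default (na.rm = TRUE) it drops leading NAs, so the result can be shorter than the input. A small sketch on a made-up vector x shows the difference:

```r
library(zoo)

# Made-up vector with a leading NA
x <- c(NA, 1, NA)

# Default: the leading NA is dropped, so the result is shorter than x
shorter <- na.locf(x)

# na.rm = FALSE: the leading NA is kept, so the length matches x
same_length <- na.locf(x, na.rm = FALSE)

print(length(shorter))
print(same_length)
```

Passing na.rm = FALSE keeps the result aligned with the column it is assigned to; values are only carried forward, so an NA that comes before the first known Age in a group stays NA.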
