Replacing values from another data frame based on the information in the first column in R
I'm trying to merge information from two different data frames, but the data frames have different dimensions, and I want to match on the values in a column rather than on the column index. The merge function in base R and the joins in dplyr don't work with my data.
I have two data frames (one is a subset of the other, with updated info in the last column):
df1 <- data.frame(Name = LETTERS[1:9], val = rep(1:3, 3), Case = c("NA","1","NA","NA","1","NA","1","NA","NA"))
  Name val Case
1    A   1   NA
2    B   2    1
3    C   3   NA
4    D   1   NA
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2   NA
9    I   3   NA
Some rows in the Case column of df1 have to be changed using the info in df2 below:
df2 <- data.frame(Name = c("A","D","H"), val = 1:3, Case = "1")
  Name val Case
1    A   1    1
2    D   2    1
3    H   3    1
There's nothing important in the val column; I included it in the examples only to show that I have more than two columns. My real data is also much bigger than the examples.
Basically, I want to change specific rows by checking the information in the first column (in this case, unique letters), and in the end I still want to have df1 as the final data frame.
To explain it better, I want to see something like this:
  Name val Case
1    A   1    1
2    B   2    1
3    C   3   NA
4    D   1    1
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2    1
9    I   3   NA
Note the changed information for A, D and H.
Thanks.
%in% from base R comes to the rescue.
df1 <- data.frame(Name = LETTERS[1:9], val = rep(1:3, 3), Case = c("NA","1","NA","NA","1","NA","1","NA","NA"), stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A","D","H"), val = 1:3, Case = "1", stringsAsFactors = FALSE)
# Rows of df1 whose Name also appears in df2
hit <- df1$Name %in% df2$Name
# Look up each matched Name's row in df2, so values are aligned by Name rather than recycled by position
df1$Case[hit] <- df2$Case[match(df1$Name[hit], df2$Name)]
df1
Output:
> df1
  Name val Case
1    A   1    1
2    B   2    1
3    C   3   NA
4    D   1    1
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2    1
9    I   3   NA
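In the example every replacement value is "1", so it is hard to tell whether values are really being looked up by Name. The sketch below is one way to check this with base R's match(). The update table df2b and its Case values are hypothetical and are not part of the question; its rows are deliberately shuffled.

```r
# Same df1 as in the question; Case holds the string "NA", not a missing value.
df1 <- data.frame(Name = LETTERS[1:9], val = rep(1:3, 3),
                  Case = c("NA","1","NA","NA","1","NA","1","NA","NA"),
                  stringsAsFactors = FALSE)

# Hypothetical update table: a different Case per Name, in shuffled order.
df2b <- data.frame(Name = c("H","A","D"), Case = c("7","8","9"),
                   stringsAsFactors = FALSE)

# For each df1$Name, match() gives its row position in df2b (NA if absent).
idx <- match(df1$Name, df2b$Name)
hit <- !is.na(idx)

# Overwrite only the matched rows, taking the value from the matching df2b row.
df1$Case[hit] <- df2b$Case[idx[hit]]
df1
```

With this data, A, D and H should end up with "8", "9" and "7" respectively, while every other row keeps its original Case.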
Here is what I would do using dplyr:
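library(dplyr)

df1 %>%
  # Left join keeps every row of df1; columns from df2 get the .y suffix
  left_join(df2, by = "Name") %>%
  # Take Case from df2 where a match exists, otherwise keep df1's value
  mutate(Case = if_else(is.na(Case.y), Case.x, Case.y)) %>%
  # Keep df1's own val, as in the desired output
  select(Name, val = val.x, Case)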
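Since the question mentions that merge did not work, here is a minimal sketch of the same left-join idea in base R. It assumes the data frames from the question with character columns; the helper names upd, m, result and the column name Case.new are made up for illustration.

```r
df1 <- data.frame(Name = LETTERS[1:9], val = rep(1:3, 3),
                  Case = c("NA","1","NA","NA","1","NA","1","NA","NA"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A","D","H"), val = 1:3, Case = "1",
                  stringsAsFactors = FALSE)

# Keep only the key and the updated column, renamed to avoid a name clash.
upd <- setNames(df2[, c("Name", "Case")], c("Name", "Case.new"))

# all.x = TRUE makes this a left join: every row of df1 is kept.
m <- merge(df1, upd, by = "Name", all.x = TRUE)

# Where df2 supplied a value, use it; otherwise keep df1's original Case.
m$Case <- ifelse(is.na(m$Case.new), m$Case, m$Case.new)

# Drop the helper column and restore the original row order of df1.
result <- m[match(df1$Name, m$Name), c("Name", "val", "Case")]
rownames(result) <- NULL
result
```

Joining on Name alone, and bringing over only the Case column, leaves the val column of df1 untouched. The final match() step is there because merge() sorts by the key by default.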