Filling an empty column with a for loop based on another column in a different data frame
I have this data frame df, and I would like to use it to look up station names while looping through my df2:
Station_ID Station_Name
1 New York
2 London
3 Madrid
4 Rome
....
I have another data frame df2:
Station x1 x2
1 10 5
1 8 6
2 21 9
4 12 7
I would like to achieve:
Station Station_Name x1 x2
1 New York 10 5
1 New York 8 6
2 London 21 9
4 Rome 12 7
What I have done so far:
df2 <- df2 %>%
add_column(Station_Name = NA)
for (i in 1:nrow(df2$Station_Name)) {
if (df$Station_ID == df2$Station) {
df2$Sitation_Name <- df$Station_Name
}
}
Error in 1:nrow(df2$Station_Name): argument of length 0
Also, just out of curiosity: what would you suggest if I had 5 different data frames instead and had to write a loop that goes through all of them to add the corresponding name?
Instead of a loop, the natural way would be to use a left_join:
library(dplyr)
df2 <- left_join(df2, df, by = c("Station" = "Station_ID"))
df2
#> Station x1 x2 Station_Name
#> 1 1 10 5 New York
#> 2 1 8 6 New York
#> 3 2 21 9 London
#> 4 4 12 7 Rome
Or using base R:
# all.x = TRUE keeps every row of df2, even when no station name matches
df2 <- merge(df2, df, by.x = "Station", by.y = "Station_ID", all.x = TRUE)
df2
#> Station x1 x2 Station_Name
#> 1 1 10 5 New York
#> 2 1 8 6 New York
#> 3 2 21 9 London
#> 4 4 12 7 Rome
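To see what all.x does in the merge() call, it may help to try a station ID that is missing from the lookup table. The sketch below uses made-up data (the names lookup and obs, and station 9, are hypothetical and not part of the question):

```r
# Lookup table and measurements; station 9 is a hypothetical unmatched ID
lookup <- data.frame(Station_ID = 1:2, Station_Name = c("New York", "London"))
obs <- data.frame(Station = c(1L, 9L), x1 = c(10L, 3L))

# Default merge is an inner join: rows without a match are dropped
inner <- merge(obs, lookup, by.x = "Station", by.y = "Station_ID")

# all.x = TRUE gives a left join: unmatched rows are kept with NA
left <- merge(obs, lookup, by.x = "Station", by.y = "Station_ID", all.x = TRUE)

nrow(inner)
left$Station_Name
```

With the data from the question every station has a match, so both variants give the same rows there.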
DATA
df <- structure(list(Station_ID = 1:4, Station_Name = c(
"New York",
"London", "Madrid", "Rome"
)), class = "data.frame", row.names = c(
NA,
-4L
))
df2 <- structure(list(Station = c(1L, 1L, 2L, 4L), x1 = c(
10L, 8L, 21L,
12L
), x2 = c(5L, 6L, 9L, 7L)), class = "data.frame", row.names = c(
NA,
-4L
))
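For the follow-up about several data frames, one possible approach is to put them in a list and apply the same join to each element. This is only a sketch: the list dfs and its contents are invented for illustration, and it assumes every data frame has a Station column.

```r
# Lookup table as in the question
df <- data.frame(Station_ID = 1:4,
                 Station_Name = c("New York", "London", "Madrid", "Rome"))

# Hypothetical list of data frames that all share a Station column
dfs <- list(
  a = data.frame(Station = c(1L, 1L, 2L, 4L), x1 = c(10L, 8L, 21L, 12L)),
  b = data.frame(Station = c(3L, 2L), x1 = c(7L, 5L))
)

# Apply the same left join to every element; lapply keeps the list names
dfs_named <- lapply(dfs, function(d) {
  merge(d, df, by.x = "Station", by.y = "Station_ID", all.x = TRUE)
})

dfs_named$b
```

Note that merge() sorts the result by the join key, so the row order within each data frame may change.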
Like Stefan, I would also advise using dplyr here. If you still want to write the loop, this would be my solution:
#Loop
df <- data.frame(Station_ID= c(1:4),
Station_Name= c("NY", "Lon", "Mad", "Rome"))
df2 <- data.frame(Station= c(1:4),
X1= c(10,8,21,12),
X2= c(5,6,9,7))
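df2$Station_Name <- NA_character_  # create the empty column first
for (i in seq_len(nrow(df2))) {
  # look up the name by station ID, not by row position
  df2$Station_Name[i] <- df$Station_Name[match(df2$Station[i], df$Station_ID)]
}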
df2
#> Station X1 X2 Station_Name
#> 1 1 10 5 NY
#> 2 2 8 6 Lon
#> 3 3 21 9 Mad
#> 4 4 12 7 Rome
Created on 2022-12-25 with reprex v2.0.2
The issue you had was that df2$Station_Name is just a vector, so nrow() cannot be applied to it: nrow() returns NULL for a vector, and 1:NULL fails with "argument of length 0". Use nrow(df2) or length(df2$Station_Name) instead.
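A small check makes the cause of the error visible. The data frame here is a cut-down stand-in for the df2 in the question:

```r
# Cut-down version of df2 with the empty column already added
df2 <- data.frame(Station = c(1L, 1L, 2L, 4L), Station_Name = NA)

is.null(nrow(df2$Station_Name))  # a single column is a vector with no dim
nrow(df2)                        # the data frame itself does have rows
length(df2$Station_Name)         # length() is the vector equivalent

# Two safe ways to build the loop index
idx_rows <- seq_len(nrow(df2))
idx_vec  <- seq_along(df2$Station_Name)
```

seq_len() and seq_along() are generally preferable to 1:n in loops because they return an empty index when there is nothing to iterate over.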
Using data.table:
library(data.table)
# nomatch = NULL drops rows of df that have no match in df2 (inner join)
setDT(df2)[df, on = .(Station = Station_ID), nomatch = NULL]
Station x1 x2 Station_Name
1: 1 10 5 New York
2: 1 8 6 New York
3: 2 21 9 London
4: 4 12 7 Rome