Add column to dataframe based on information from another df2 in R
Suppose I have two data frames like these:
Model <- c("H5", "H5", "H5","H4","H3")
Code <- c("001001", "001002","001003","001004","001005")
City <- c("Mexico", "London", "NY", "Otawa", "Liverpool")
df1 <- data.frame(Model, Code, City)
Model Code City
H5 001001 Mexico
H5 001002 London
H5 001003 NY
H4 001004 Otawa
H3 001005 Liverpool
and
X <- c("030299", "010121","030448","030324","010245","001001", "001002","001003","001004","001005")
Y <- c("030344", "010222","030448","030001","010245","221001", "221044","221044","221004"," 001005")
Var1 <- c("H5", "H5", "H4","H4","H4","H5", "H5", "H5","H4","H3")
Var2 <- c("H4", "H2", "H4","H3","H4","H3", "H3", "H3","H3","H3")
df2 <- data.frame(X,Y,Var1,Var2)
X Y VAR1 VAR2
030299 030344 H5 H4
010121 010222 H5 H2
030448 030448 H4 H4
030324 030001 H4 H3
010245 010245 H4 H4
001001 221001 H5 H3
001002 221044 H5 H3
001003 221044 H5 H3
001004 221004 H4 H3
001005 001005 H3 H3
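I want to write a function that does the following:
For example, if I pass H3 as the function's argument, I want to take every value in the "Code" column of df1, together with its corresponding value in the "Model" column, and convert those codes using the information in df2. For example, if we select the first row of df1 and set H3 as the argument: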
H5 001001 Mexico
the function must fetch the corresponding row from df2:
X Y VAR1 VAR2
001001 221001 H5 H3
and then give me output like this:
X Y VAR2 City
001001 221001 H3 Mexico
The final output should look like this:
X Y VAR2 City
001001 221001 H3 Mexico
001002 221044 H3 London
001003 221044 H3 NY
001004 221004 H3 Otawa
001005 221056 H3 Liverpool
Perhaps as a starting point, this reproduces the result of your example.
library(dplyr)
df2 %>%
  left_join(df1, by = c("Var1" = "Model", "X" = "Code")) %>%
  filter(Var2 == "H3", !is.na(City)) %>%
  select(-Var1)
X Y Var2 City
1 001001 221001 H3 Mexico
2 001002 221044 H3 London
3 001003 221044 H3 NY
4 001004 221004 H3 Otawa
5 001005 001005 H3 Liverpool
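The question asks for a function that takes the target model (for example H3) as a parameter. A minimal base R sketch of that idea, using merge() instead of dplyr, might look like the following. The name convert_codes and its argument names are hypothetical, chosen only for illustration; the data is copied from the question (including the leading space in the last Y value).

```r
# Example data copied from the question
df1 <- data.frame(
  Model = c("H5", "H5", "H5", "H4", "H3"),
  Code  = c("001001", "001002", "001003", "001004", "001005"),
  City  = c("Mexico", "London", "NY", "Otawa", "Liverpool")
)
df2 <- data.frame(
  X    = c("030299", "010121", "030448", "030324", "010245",
           "001001", "001002", "001003", "001004", "001005"),
  Y    = c("030344", "010222", "030448", "030001", "010245",
           "221001", "221044", "221044", "221004", " 001005"),
  Var1 = c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3"),
  Var2 = c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3")
)

# convert_codes() is a hypothetical helper name, not part of any package
convert_codes <- function(df1, df2, target) {
  # Inner join: keep df2 rows whose (Var1, X) pair equals a (Model, Code) pair in df1
  joined <- merge(df2, df1,
                  by.x = c("Var1", "X"),
                  by.y = c("Model", "Code"))
  # Keep only the rows that convert to the requested target model
  out <- joined[joined$Var2 == target, c("X", "Y", "Var2", "City")]
  # Sort by code and reset row names for a tidy result
  out <- out[order(out$X), ]
  rownames(out) <- NULL
  out
}

res <- convert_codes(df1, df2, "H3")
print(res)
```

Because merge() performs an inner join by default, rows of df2 without a counterpart in df1 are dropped, so no separate !is.na(City) filter is needed in this variant.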
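Following your logic, I tried to write a custom function in base R. It takes 3 arguments: df1, df2, x.
x is the index (or indices) of the df1 rows you want to process, so you can select all rows or just one, as you explained in your example.
my_function <- function(df1, df2, x){
  select_row <- df1[x, ]
  # For each selected Code, find the df2 row with the same X
  idx <- match(select_row$Code, df2$X)
  cbind(df2[idx, c("X", "Y", "Var2")], select_row["City"])
}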
my_function(df1, df2, 1:5)
X Y Var2 City
6 001001 221001 H3 Mexico
7 001002 221044 H3 London
8 001003 221044 H3 NY
9 001004 221004 H3 Otawa
10 001005 001005 H3 Liverpool
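A note on looking up codes in base R: comparing two vectors of different lengths with == recycles the shorter one and compares position by position, so it only finds matches that happen to line up. match() (or %in%) looks each value up regardless of position. A small sketch, where the codes vector is a hypothetical selection made up for illustration:

```r
X <- c("030299", "010121", "030448", "030324", "010245",
       "001001", "001002", "001003", "001004", "001005")

# Hypothetical selection of two codes, deliberately not in the same order as X
codes <- c("001003", "001001")

# `==` recycles `codes` along X: odd positions are compared with "001003",
# even positions with "001001", so only aligned matches are found
by_equal <- which(X == codes)

# match() returns, for each code, the position of its first occurrence in X
by_match <- match(codes, X)

print(by_equal)  # 6: "001003" sits at position 8 and is missed
print(by_match)  # 8 6
```

With the data from the question the two approaches give the same rows only because the five codes line up with the last five entries of X.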
Like this?
library(data.table)
setDT(df1);setDT(df2)
df2[df1, on = .(Var1 = Model, X = Code)]
# X Y Var1 Var2 City
# 1: 001001 221001 H5 H3 Mexico
# 2: 001002 221044 H5 H3 London
# 3: 001003 221044 H5 H3 NY
# 4: 001004 221004 H4 H3 Otawa
# 5: 001005 001005 H3 H3 Liverpool
Another approach:
library(tidyverse)
Model <- c("H5", "H5", "H5", "H4", "H3")
Code <- c("001001", "001002", "001003", "001004", "001005")
City <- c("Mexico", "London", "NY", "Otawa", "Liverpool")
df1 <- data.frame(Model, Code, City)
X <- c("030299", "010121", "030448", "030324", "010245", "001001", "001002", "001003", "001004", "001005")
Y <- c("030344", "010222", "030448", "030001", "010245", "221001", "221044", "221044", "221004", " 001005")
Var1 <- c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3")
Var2 <- c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3")
df2 <- data.frame(X, Y, Var1, Var2)
my_fun <- function(row, var2) {
  df1_data <- df1 %>% slice(row)
  df2 %>%
    filter(Var2 == var2 & X == df1_data$Code) %>%
    mutate(df1_data$City)
}
1:nrow(df1) %>%
  map_dfr(~ my_fun(.x, "H3"))
#> X Y Var1 Var2 df1_data$City
#> 1 001001 221001 H5 H3 Mexico
#> 2 001002 221044 H5 H3 London
#> 3 001003 221044 H5 H3 NY
#> 4 001004 221004 H4 H3 Otawa
#> 5 001005 001005 H3 H3 Liverpool
Created on 2022-04-14 by the reprex package (v2.0.1)