Add column to dataframe based on information from another df2 in R

Let's say I have 2 dataframes like this:

Model <- c("H5", "H5", "H5","H4","H3")
Code <- c("001001", "001002","001003","001004","001005")
City <-  c("Mexico", "London", "NY", "Otawa", "Liverpool")

df1 <- data.frame(Model, Code, City)


Model   Code       City
H5      001001     Mexico  
H5      001002     London
H5      001003     NY
H4      001004     Otawa
H3      001005     Liverpool

And

X <- c("030299", "010121","030448","030324","010245","001001", "001002","001003","001004","001005")
Y <- c("030344", "010222","030448","030001","010245","221001", "221044","221044","221004"," 001005")
Var1 <- c("H5", "H5", "H4","H4","H4","H5", "H5", "H5","H4","H3")
Var2 <- c("H4", "H2", "H4","H3","H4","H3", "H3", "H3","H3","H3")

  df2 <- data.frame(X,Y,Var1,Var2)

  X            Y     VAR1   VAR2
030299      030344    H5     H4
010121      010222    H5     H2
030448      030448    H4     H4
030324      030001    H4     H3
010245      010245    H4     H4
001001      221001    H5     H3
001002      221044    H5     H3
001003      221044    H5     H3
001004      221004    H4     H3
001005      001005    H3     H3

I want to code the following:

For example, if I pass H3 as an argument to a function, I want to take every value in the 'Code' column of df1, together with its corresponding value in the 'Model' column, and convert it using the information in df2. For example, if we take the first row of df1 and set H3 as the argument:

  H5      001001    Mexico 

The function must take the corresponding row from df2:

   X            Y     VAR1   VAR2
 001001      221001    H5     H3

and give me output like this:

   X            Y    VAR2  City   
 001001      221001   H3   Mexico   

The final output should look like this:

  X            Y     VAR2   City 

001001      221001    H3   Mexico  
001002      221044    H3   London  
001003      221044    H3   NY
001004      221004    H3   Otawa  
001005      221056    H3   Liverpool 

Maybe something to begin with: this reproduces the result of your example.

library(dplyr)

df2 %>% 
  left_join(df1, by = c( "Var1" = "Model", "X" = "Code")) %>% 
  filter(Var2 == "H3", !is.na(City)) %>% 
  select(-Var1)

       X       Y Var2      City
1 001001  221001   H3    Mexico
2 001002  221044   H3    London
3 001003  221044   H3        NY
4 001004  221004   H3     Otawa
5 001005  001005   H3 Liverpool
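
Since the question asks for a function that takes the target model (e.g. H3) as an argument, the same join can be wrapped in one. The following is only a minimal sketch using base R's merge(); the name convert_codes and its target parameter are illustrative assumptions, not part of the original answers, and the data is rebuilt from the question so the block runs on its own.

```r
# Example data, rebuilt from the question.
df1 <- data.frame(
  Model = c("H5", "H5", "H5", "H4", "H3"),
  Code  = c("001001", "001002", "001003", "001004", "001005"),
  City  = c("Mexico", "London", "NY", "Otawa", "Liverpool")
)
df2 <- data.frame(
  X    = c("030299", "010121", "030448", "030324", "010245",
           "001001", "001002", "001003", "001004", "001005"),
  Y    = c("030344", "010222", "030448", "030001", "010245",
           "221001", "221044", "221044", "221004", " 001005"),
  Var1 = c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3"),
  Var2 = c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3")
)

# convert_codes() is a hypothetical helper name used for illustration.
convert_codes <- function(df1, df2, target) {
  # Keep only the df2 rows that convert to the requested model.
  candidates <- df2[df2$Var2 == target, , drop = FALSE]
  # Inner join: df2$X must equal df1$Code and df2$Var1 must equal df1$Model.
  out <- merge(candidates, df1,
               by.x = c("X", "Var1"), by.y = c("Code", "Model"))
  # Return the columns requested in the question.
  out[, c("X", "Y", "Var2", "City")]
}

res <- convert_codes(df1, df2, "H3")
```

Because merge() performs an inner join by default, df2 rows with no counterpart in df1 drop out without an explicit !is.na() filter.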

Following your logic, I tried to create a custom function with base R. It takes 3 arguments: df1, df2, and x, where x is the index (or indices) of the df1 rows you want to process. So you can select all rows or just one, as you explained in your example.

my_function <- function(df1, df2, x){
  select_row <- df1[x, ]
  # find each selected Code in df2$X (match() avoids vector recycling)
  idx <- match(select_row$Code, df2$X)
  cbind(df2[idx, c("X", "Y", "Var2")], select_row["City"])
}

my_function(df1, df2, 1:5)
        X       Y Var2      City
6  001001  221001   H3    Mexico
7  001002  221044   H3    London
8  001003  221044   H3        NY
9  001004  221004   H3     Otawa
10 001005  001005   H3 Liverpool
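
One thing worth checking when looking up several codes at once: == compares element by element and recycles the shorter vector, so it only finds the right rows when the positions happen to line up, whereas %in% and match() do a real lookup. Here is a small sketch with made-up vectors (the names codes and wanted are illustrative, not from the question):

```r
codes  <- c("030299", "010121", "001001", "001002")
wanted <- c("001002", "001001")

# Element-wise comparison: `wanted` is recycled to length 4,
# i.e. compared as c("001002", "001001", "001002", "001001").
by_equals <- codes == wanted

# True lookups, independent of position.
by_in    <- codes %in% wanted     # which codes are wanted?
by_match <- match(wanted, codes)  # where does each wanted code sit?

print(by_equals)
print(by_in)
print(by_match)
```

Here by_equals contains no TRUE at all even though both wanted codes are present, while by_in flags the last two elements and by_match returns their positions in the order of wanted.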

Like this?

library(data.table)
setDT(df1);setDT(df2)
df2[df1, on = .(Var1 = Model, X = Code)]
#         X       Y Var1 Var2      City
# 1: 001001  221001   H5   H3    Mexico
# 2: 001002  221044   H5   H3    London
# 3: 001003  221044   H5   H3        NY
# 4: 001004  221004   H4   H3     Otawa
# 5: 001005  001005   H3   H3 Liverpool

An alternative approach:

Data

library(tidyverse)

Model <- c("H5", "H5", "H5", "H4", "H3")
Code <- c("001001", "001002", "001003", "001004", "001005")
City <- c("Mexico", "London", "NY", "Otawa", "Liverpool")

df1 <- data.frame(Model, Code, City)

X <- c("030299", "010121", "030448", "030324", "010245", "001001", "001002", "001003", "001004", "001005")
Y <- c("030344", "010222", "030448", "030001", "010245", "221001", "221044", "221044", "221004", " 001005")
Var1 <- c("H5", "H5", "H4", "H4", "H4", "H5", "H5", "H5", "H4", "H3")
Var2 <- c("H4", "H2", "H4", "H3", "H4", "H3", "H3", "H3", "H3", "H3")

df2 <- data.frame(X, Y, Var1, Var2)

Function

my_fun <- function(row, var2) {
  df1_data <- df1 %>% slice(row)
  df2 %>%
    filter(Var2 == var2 & X == df1_data$Code) %>%
    mutate(df1_data$City)
}

1:nrow(df1) %>%
  map_dfr(~ my_fun(.x, "H3"))
#>        X       Y Var1 Var2 df1_data$City
#> 1 001001  221001   H5   H3        Mexico
#> 2 001002  221044   H5   H3        London
#> 3 001003  221044   H5   H3            NY
#> 4 001004  221004   H4   H3         Otawa
#> 5 001005  001005   H3   H3     Liverpool

Created on 2022-04-14 by the reprex package (v2.0.1)
