
R for loop with if else on multiple data frames

Greetings, and thanks in advance for any help. I have many data frames that resemble the ones below.

df1

   name info
1  john    A
2   jim    B
3   tom    B
4  bill    B

dframe

  name other
1  sam   pro
2  dad   mo1
3  mom  Bxxx

frame3

   name otherinfo
1   jus         A
2    do         7
3 r pro         B
4   sir         B
5  real        na
6  pete       yes

OLFrame

   name information
1  ally          x1
2   mom          B9
3 r pro         s3B
4   tom         Bd0
5 kelly          ot
6  jojo         who
7    na          11

I would like to:

  1. take each name from the "name" column of data frame "OLFrame" and check whether it exists in the "name" column of "df1"
  2. create a column vector named "df1" containing 1 if the name from "OLFrame" exists in "df1" and 0 if it does not
  3. repeat steps 1 and 2 using "dframe" and "frame3"
  4. create a new data frame called "newOLFrame" consisting of "OLFrame" plus the new columns named "df1", "dframe" and "frame3"

The desired result should look like this:

newOLFrame

   name information df1 dframe frame3
1  ally          x1   0      0      0
2   mom          B9   0      1      0
3 r pro         s3B   0      0      1
4   tom         Bd0   1      0      0
5 kelly          ot   0      0      0
6  jojo         who   0      0      0
7    na          11   0      0      0

I can do one at a time (below), but I have over a hundred files to look through.

newOLFrame <- OLFrame
newOLFrame[, "df1"] <- ifelse(newOLFrame$name %in% df1$name, 1, 0)

Please help. Thanks again

Consider a chained merge: first build a list of data frames, each left-joined to OLFrame, then merge them all together at the end with Reduce:

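df_list <- lapply(c("df1", "dframe", "frame3"), function(nm) {
  df <- get(nm)                  # FETCH DATA FRAME BY NAME
  df[[nm]] <- 1                  # INDICATOR COLUMN NAMED AFTER THE DATA FRAME

  # LEFT JOIN: UNMATCHED OLFrame ROWS GET NA, WHICH BECOMES 0
  df <- merge(OLFrame, df[c("name", nm)], by="name", all.x=TRUE)
  df[[nm]] <- ifelse(is.na(df[[nm]]), 0, 1)

  return(df)
})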

# MERGE ALL DFs
final_df <- Reduce(function(x, y) merge(x, y, by=c("name", "information")), df_list)
final_df
#    name information df1 dframe frame3
# 1  ally          x1   0      0      0
# 2  jojo         who   0      0      0
# 3 kelly          ot   0      0      0
# 4   mom          B9   0      1      0
# 5    na          11   0      0      0
# 6 r pro         s3B   0      0      1
# 7   tom         Bd0   1      0      0
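
For comparison, the one-at-a-time `%in%` line from the question can also be looped over the data frame names directly, which keeps the original row order of OLFrame and avoids merging. The following is only a minimal sketch: it assumes the data frames already exist in the global environment under the names shown, and the sample data is retyped from the tables in the question.

```r
# Sample data retyped from the question (assumed to match the asker's real data)
df1 <- data.frame(name = c("john", "jim", "tom", "bill"),
                  info = c("A", "B", "B", "B"),
                  stringsAsFactors = FALSE)
dframe <- data.frame(name = c("sam", "dad", "mom"),
                     other = c("pro", "mo1", "Bxxx"),
                     stringsAsFactors = FALSE)
frame3 <- data.frame(name = c("jus", "do", "r pro", "sir", "real", "pete"),
                     otherinfo = c("A", "7", "B", "B", "na", "yes"),
                     stringsAsFactors = FALSE)
OLFrame <- data.frame(name = c("ally", "mom", "r pro", "tom", "kelly", "jojo", "na"),
                      information = c("x1", "B9", "s3B", "Bd0", "ot", "who", "11"),
                      stringsAsFactors = FALSE)

# Names of the data frames to check against
df_names <- c("df1", "dframe", "frame3")

newOLFrame <- OLFrame
for (nm in df_names) {
  # 1 if the OLFrame name occurs in that data frame's name column, else 0
  newOLFrame[[nm]] <- as.integer(OLFrame$name %in% get(nm)$name)
}
newOLFrame
```

On the sample data this gives the same 0/1 columns as the desired newOLFrame, with rows left in their original order rather than sorted by name.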

Alternatively, since Reduce can have performance issues with large lists, consider do.call: order each merged data frame, keep only the needed indicator column, and column-bind all the pieces at the end:

df_list <- lapply(c("df1", "dframe", "frame3"), function(nm) {
  df <- get(nm)
  df[[nm]] <- 1

  df <- merge(OLFrame, df[c("name", nm)], by="name", all.x=TRUE, sort=FALSE)
  df[[nm]] <- ifelse(is.na(df[[nm]]), 0, 1)

  df <- with(df, df[order(name, information),])        # ORDER DATA FRAME
  small_df <- setNames(as.data.frame(df[[nm]]), nm)    # SUBSET ONE COLUMN

  return(small_df)
})

# ORDER OLFrame THE SAME WAY SO ROWS LINE UP WITH THE INDICATOR COLUMNS
OLFrame <- with(OLFrame, OLFrame[order(name, information),])

# WRAP OLFrame IN list() SO IT IS PASSED TO cbind AS ONE DATA FRAME
final_df <- do.call(cbind, c(list(OLFrame), df_list))
final_df

#    name information df1 dframe frame3
# 1  ally          x1   0      0      0
# 2  jojo         who   0      0      0
# 3 kelly          ot   0      0      0
# 4   mom          B9   0      1      0
# 5    na          11   0      0      0
# 6 r pro         s3B   0      0      1
# 7   tom         Bd0   1      0      0
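
Since the question mentions over a hundred files, the same lookup can be driven from a folder listing instead of typing each name. This is a rough sketch under stated assumptions: the folder `frame_dir`, the CSV format, and the file names are hypothetical (the question does not say how the files are stored), and each file is assumed to have a "name" column.

```r
# Hypothetical setup: write the sample lookup tables to a temporary folder as CSV
frame_dir <- tempfile("frames")
dir.create(frame_dir)
write.csv(data.frame(name = c("john", "jim", "tom", "bill"),
                     info = c("A", "B", "B", "B")),
          file.path(frame_dir, "df1.csv"), row.names = FALSE)
write.csv(data.frame(name = c("sam", "dad", "mom"),
                     other = c("pro", "mo1", "Bxxx")),
          file.path(frame_dir, "dframe.csv"), row.names = FALSE)
write.csv(data.frame(name = c("jus", "do", "r pro", "sir", "real", "pete"),
                     otherinfo = c("A", "7", "B", "B", "na", "yes")),
          file.path(frame_dir, "frame3.csv"), row.names = FALSE)

OLFrame <- data.frame(name = c("ally", "mom", "r pro", "tom", "kelly", "jojo", "na"),
                      information = c("x1", "B9", "s3B", "Bd0", "ot", "who", "11"),
                      stringsAsFactors = FALSE)

# Read every CSV in the folder into a list named after the files
files <- list.files(frame_dir, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(files, read.csv, stringsAsFactors = FALSE)
names(frames) <- tools::file_path_sans_ext(basename(files))

# One 0/1 column per file: 1 if the OLFrame name appears in that file
flags <- vapply(frames,
                function(df) as.integer(OLFrame$name %in% df$name),
                integer(nrow(OLFrame)))

# Column names of the new columns come from the file names
newOLFrame <- cbind(OLFrame, flags)
newOLFrame
```

Keeping the tables in a named list rather than as separate objects means adding another file needs no change to the code, and `vapply` checks that every file yields one value per OLFrame row.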


 