
In R, how can I copy rows from one dataframe to another when the df being copied to has 2 additional columns?

I have a tab delimited text file with 12 columns that I am uploading to my program. I go on to create another dataframe with a structure similar to the one uploaded and add 2 more columns to it.

excelfile = read.delim(ExcelPath)
matchedPictures<- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

Now I have a function in which I do the following:

  1. Based on a condition, I obtain the row number pictureMatchNum of the row I need to copy from excelfile to matchedPictures.
  2. I should then copy the row from excelfile to matchedPictures. I have tried a couple of different ways so far.

    a.

     rowNumber = nrow(matchedPictures) + 1
     matchedPictures[rowNumber,1:12] <<- excelfile[pictureMatchNum,1:12]

    b.

    matchedPictures[rowNumber,1:12] <<- rbind(matchedPictures, excelfile[pictureWordMatches,1:12], make.row.names = FALSE)

2a. doesn't seem to work because it copies the indices from excelfile and uses them as row names in matchedPictures, which is why I decided to try rbind.

2b. doesn't seem to work because rbind needs the columns to be identical, and matchedPictures has 2 extra columns.

EDIT START - Including reproducible example.

Here is some reproducible code (with fewer columns and fake data)

library(stringr) # provides words, fruit, str_detect() and regex()
excelfile <- data.frame(x = letters, y = words[length(letters)], z = fruit[length(letters)])
matchedPictures <- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

pictureMatchNum1 = match(1, str_detect("A", regex(excelfile$x, ignore_case = TRUE)))
rowNumber1 = nrow(matchedPictures) + 1

pictureMatchNum2 = match(1, str_detect("D", regex(excelfile$x, ignore_case = TRUE)))
rowNumber2 = nrow(matchedPictures) + 1

The 2 options I tried are

2a.

matchedPictures[rowNumber1,1:3] <<- excelfile[pictureMatchNum1,1:3]
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"

matchedPictures[rowNumber2,1:3] <<- excelfile[pictureMatchNum2,1:3]
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"

OR

2b.

matchedPictures[rowNumber1,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum1,1:3], make.row.names = FALSE)
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"

matchedPictures[rowNumber2,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum2,1:3], make.row.names = FALSE)
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"

EDIT END

Additionally, I have seen the suggestion in many places that, rather than growing an empty dataframe, one should append data to vectors and then combine them into a dataframe. Is this suggestion valid when I have so many columns and would need 14 separate vectors, each of which has to be filled individually?

What can I do to make this work?

You could:

  • first determine the row indices of excelfile that match your criteria
  • extract these rows
  • then generate the data to fill your columns beforeName and afterName
  • then append these columns to your new data frame

Example:

library(stringr) ## provides 'words', 'fruit', str_detect() and regex()

excelfile <- data.frame(x = letters, y = words[length(letters)],
                        z = fruit[length(letters)])
## Vector of patterns:
patternVec <- c("A", "D", "M")
## Look for appropriate rows in file 'excelfile':
indexVec <- vapply(patternVec,
                   function(myPattern) which(str_detect(myPattern,
                       regex(excelfile$x, ignore_case = TRUE))),
                   integer(1))
## Extract these rows:
matchedPictures <- excelfile[indexVec,]
## Somehow generate the data for columns 'beforeName' and 'afterName':
## I do not know how this information is generated so I just insert 
## some dummy code here:
beforeNameVec <- c("xxx", "uuu", "mmm")
afterNameVec <- c("yyy", "www", "nnn")
## Then assign these variables:
matchedPictures$beforeName <- beforeNameVec
matchedPictures$afterName <- afterNameVec

matchedPictures
# x   y           z beforeName afterName
# a air dragonfruit        xxx       yyy
# d air dragonfruit        uuu       www
# m air dragonfruit        mmm       nnn
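
If the rows have to be collected one at a time, for instance inside a function that is called once per match, one option is to build each row as a one-row data frame that already has all columns, keep those in a list, and bind everything once at the end. This avoids both the mismatched-column problem with rbind and the need for 14 separate vectors. The following is only a minimal sketch: makeRow and rowList are hypothetical names, and the small excelfile is made-up data standing in for the real 12-column file.

```r
## Made-up stand-in for the uploaded file (assumption: 3 columns instead of 12)
excelfile <- data.frame(x = letters[1:5],
                        y = c("v", "w", "x", "y", "z"),
                        z = 1:5,
                        stringsAsFactors = FALSE)

## Hypothetical helper: copy one row and attach the two extra columns
makeRow <- function(rowIndex, before, after) {
  newRow <- excelfile[rowIndex, , drop = FALSE]
  newRow$beforeName <- before
  newRow$afterName <- after
  newRow
}

## Collect the one-row data frames in a list ...
rowList <- list()
rowList[[length(rowList) + 1]] <- makeRow(1, "xxx", "yyy")
rowList[[length(rowList) + 1]] <- makeRow(4, "uuu", "www")

## ... and bind them once at the end (all rows have identical columns)
matchedPictures <- do.call(rbind, rowList)

## Drop the row names inherited from 'excelfile' so they are renumbered from 1
rownames(matchedPictures) <- NULL
```

Because every element of rowList has the same columns, rbind has nothing to complain about, and resetting the row names afterwards addresses the issue described under 2a.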

You can make this much simpler by using dplyr:

library(dplyr)
library(stringr)

excelfile <- data.frame(x = letters, y = words[length(letters)], z = fruit[length(letters)],
                        stringsAsFactors = FALSE) # add stringsAsFactors to have character columns

pictureMatch <- excelfile %>%
  # create a match column
  mutate(match = ifelse(str_detect(x, "a") | str_detect(x, "d"), 1, 0)) %>%
  # filter to only the rows that match your condition
  filter(match == 1)

pictureMatch <- pictureMatch[["x"]] # convert to a vector

matchedPictures <- excelfile %>%
  filter(x %in% pictureMatch) %>% # grab the rows that match your condition
  mutate(beforeName = c("xxx", "uuu"), # add your names
         afterName = c("yyy", "www"))
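
If you do need to append rows to an existing data frame that has extra columns, dplyr's bind_rows() may help: it matches columns by name and fills columns that are missing from one input with NA, so the two data frames do not need identical columns. The following is a minimal sketch with made-up data; the small excelfile and the variable newRow are assumptions for illustration only.

```r
library(dplyr)

# Made-up stand-in for the uploaded file
excelfile <- data.frame(x = letters[1:5], z = 1:5, stringsAsFactors = FALSE)

# Empty target with the two extra character columns, as in the question
matchedPictures <- excelfile[0, ]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

# Append row 4; beforeName and afterName are filled with NA for the new row
matchedPictures <- bind_rows(matchedPictures, excelfile[4, ])

# Fill the extra columns for the row that was just added
newRow <- nrow(matchedPictures)
matchedPictures[newRow, "beforeName"] <- "uuu"
matchedPictures[newRow, "afterName"] <- "www"
```

Growing a data frame row by row like this is convenient but can get slow for many rows, so the filter-then-mutate approach above is usually preferable when all matches are known up front.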
