
R: How to re-concatenate a split string in a data frame

I created a data frame using the following:

Student <- c("John Davis","Angela Williams","Bullwinkle Moose","David Jones",
"Janice Markhammer","Cheryl Cushing","Reuven Ytzrhak","Greg Knox","Joel England",
"Mary Rayburn")
Math <- c(502,600,412,358,495,512,410,625,573,522)
Science <- c(95,99,80,82,75,85,80,95,89,86)
English <- c(25,22,18,15,20,28,15,30,27,18)
student.exam.data <- data.frame(Student,Math,Science,English)

I then split "John Davis" into c("John", "Davis") via student.exam.data$Student[1] <- strsplit(as.character(student.exam.data$Student[1]), " ", fixed = FALSE).

I'm now attempting to re-concatenate the two strings into a single "John Davis" string. I've tried paste(student.exam.data$Student[1], collapse = " "), paste(as.vector(student.exam.data$Student[1]), collapse = " "), and toString(student.exam.data$Student[1]). All three return "c(\"John\", \"Davis\")".

Firstly, why do these return the backslashes, and secondly, what would be the appropriate way to approach this?

The problem is that the line

student.exam.data$Student[1] <- strsplit(as.character(student.exam.data$Student[1]), " ", fixed = FALSE)

transforms the first variable in your data frame into a list:

str(student.exam.data)
'data.frame':   10 obs. of  4 variables:
 $ Student:List of 10
  ..$ : chr  "John" "Davis"
  ..$ : chr "Angela Williams"
  ..$ : chr "Bullwinkle Moose"
  ..$ : chr "David Jones"
  ..$ : chr "Janice Markhammer"
  ..$ : chr "Cheryl Cushing"
  ..$ : chr "Reuven Ytzrhak"
  ..$ : chr "Greg Knox"
  ..$ : chr "Joel England"
  ..$ : chr "Mary Rayburn"
 $ Math   : num  502 600 412 358 495 512 410 625 573 522
 $ Science: num  95 99 80 82 75 85 80 95 89 86
 $ English: num  25 22 18 15 20 28 15 30 27 18

As a result, the first element now holds two values. To recombine them in the literal sense of your question, index into the list element with [[1]] and paste its two parts:

student.exam.data$Student[1] <- paste(student.exam.data$Student[[1]][1], student.exam.data$Student[[1]][2])

What it doesn't do is remedy the fact that your first variable is still a list.
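
As for the backslashes: paste() calls as.character() on its arguments, and for a list that deparses each element, so the result is the text of the call c("John", "Davis") as one string. The backslashes are only how print() escapes the quotes inside that string. Below is a minimal sketch of this, plus one way to turn the list column back into a plain character vector. The object names (students, deparsed, flattened) are illustrative and not from the question.

```r
# A small list shaped like the Student column after the strsplit() assignment
# (illustrative data, not the full data frame).
students <- list(c("John", "Davis"), "Angela Williams", "Bullwinkle Moose")

# paste() applies as.character() to the list, which deparses the first element
# into the text of the call that would create it.
deparsed <- paste(students[1], collapse = " ")
print(deparsed)      # print() escapes the embedded quotes with backslashes
cat(deparsed, "\n")  # cat() writes the raw characters, with no backslashes

# Collapse each list element to a single string; vapply() guarantees the
# result is a plain character vector with one string per element.
flattened <- vapply(students, paste, character(1), collapse = " ")
print(flattened)
```

Applied to the data frame, the same vapply() call assigned back to student.exam.data$Student should restore an ordinary character column.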

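You may find it more convenient to use tidyr::separate() and tidyr::unite(), applied to the original data frame (before the strsplit() assignment, while Student is still a character column).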

Example:

library(tidyr)

student.exam.data %>% separate(Student, c('first_name','last_name')) -> d2

head(d2,3)

returns:

  first_name last_name Math Science English
1       John     Davis  502      95      25
2     Angela  Williams  600      99      22
3 Bullwinkle     Moose  412      80      18

Likewise, unite() joins the two columns back together:

d2 %>% unite('full_name', first_name, last_name, sep=' ') -> d3

head(d3, 3)

returns:

         full_name Math Science English
1       John Davis  502      95      25
2  Angela Williams  600      99      22
3 Bullwinkle Moose  412      80      18
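
For comparison, the same split-and-rejoin round trip can be sketched in base R without tidyr. This is an alternative to the separate()/unite() approach above, not a replacement for it; the data frame name exam is illustrative, and the sketch assumes every name has exactly two space-separated parts.

```r
# Illustrative data: the first three rows from the question.
exam <- data.frame(
  Student = c("John Davis", "Angela Williams", "Bullwinkle Moose"),
  Math = c(502, 600, 412),
  stringsAsFactors = FALSE
)

# Split each name on the single space. Because every name has two parts,
# the pieces bind row-wise into a two-column character matrix.
parts <- do.call(rbind, strsplit(exam$Student, " ", fixed = TRUE))
exam$first_name <- parts[, 1]
exam$last_name <- parts[, 2]

# paste() is vectorised, so this rebuilds one full name per row.
exam$full_name <- paste(exam$first_name, exam$last_name)
print(exam)
```

Keeping the pieces in their own columns, rather than assigning the strsplit() result back into Student, is what avoids the list column in the first place.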

