
R unlisting nested row values

Hi, I have a data frame in which some rows hold multiple values as a list.

var1
A8
A9
c("A1", "A1", "D3")
c("A1", "D1")
c("D1", "D1")
c("D2", "A2")
c("D5", "A1")

I'm trying to 'unlist' the rows with multiple values, keeping only the first observation. I've been playing around with the unlist command without any luck. What is the easiest way to accomplish this task?

As indicated in the comments, the column first has to be coerced (converted) from its current factor class to character class using as.character.

The conversion can be avoided at the file-reading stage by passing stringsAsFactors=FALSE to the read function, so the column is read as character from the start.
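
As a minimal sketch of the file-reading step, assuming the data comes from a CSV file: the inline csv_text below is a hypothetical stand-in for the real file, and read.csv is only one of the read functions that accept stringsAsFactors. Since R 4.0.0 the default is already FALSE, so the argument mainly matters on older versions.

```r
# Hypothetical inline CSV standing in for the real input file
csv_text <- "Var1,Freq\n\"B1,B2,B3\",2538\nB4,633\n"

# stringsAsFactors = FALSE keeps Var1 as character instead of factor
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(df$Var1)
```

With the column read as character, the as.character step is not needed before any string operations.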

Splitting each row on commas and retaining only the first value can be done with:

copyDF$Var1 = sapply(strsplit(copyDF$Var1,","),head,1)
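
The values shown in the question look like deparsed vectors, for example c("A1", "D1"), rather than plain comma-separated text. Splitting those on "," alone would leave the c(" prefix on the first piece. A rough sketch of one way to handle that form, assuming the column is already character; the names var1, cleaned and first_value are made up for illustration:

```r
# Hypothetical character column mixing single values and deparsed vectors
var1 <- c("A8", "A9", 'c("A1", "A1", "D3")', 'c("A1", "D1")')

# Drop the leading c( , the trailing ) and the embedded double quotes
cleaned <- gsub('^c\\(|\\)$|"', "", var1)

# Split on a comma plus any following spaces and keep the first piece;
# x[1] gives NA for an empty split result, so vapply always gets one value
first_value <- vapply(strsplit(cleaned, ", *"), function(x) x[1], character(1))

first_value
```

vapply is used here instead of sapply because an empty string splits to character(0), in which case sapply with head would return a list rather than a character vector.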

Let us know if this works:

#user input data with factor class
userDF = structure(list(Var1 = structure(1:6, .Label = c("", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "c(\"B1\", \"B1\")", "c(\"B3\", \"B4\")", "c(\"B4\", \"B2\")"), class = "factor"), Freq = c(2538L, 633L, 458L, 328L, 135L, 56L)), .Names = c("Var1", "Freq"), row.names = c(NA, 6L), class = "data.frame")
userDF
#  Var1 Freq
#1      2538
#2   B1  633
#3   B2  458
#4   B3  328
#5   B4  135
#6   B5   56

str(userDF)
#   'data.frame':   6 obs. of  2 variables:
#$ Var1: Factor w/ 12 levels "","B1","B2","B3",..: 1 2 3 4 5 6
#$ Freq: int  2538 633 458 328 135 56

#userDF has no rows with multiple values, so build a data frame that does
newDF = structure(list(Var1 = structure(1:6, .Label = c("B1,B2,B3", "B4,B5", "B6,B7,B8", "B3", "B4", "B5", "B6", "B7", "B8", "c(\"B1\", \"B1\")", "c(\"B3\", \"B4\")", "c(\"B4\", \"B2\")"), class = "factor"), Freq = c(2538L, 633L, 458L, 328L, 135L, 56L)), .Names = c("Var1", "Freq"), row.names = c(NA, 6L), class = "data.frame")
newDF
#      Var1 Freq
#1 B1,B2,B3 2538
#2    B4,B5  633
#3 B6,B7,B8  458
#4       B3  328
#5       B4  135
#6       B5   56


#Make a copy of the dataset
copyDF = newDF

#Var1 is of class factor, which is not amenable to string operations, hence convert it to character class
copyDF$Var1 = as.character(copyDF$Var1)

#Split each row on commas and retain only the first value

copyDF$Var1 = sapply(strsplit(copyDF$Var1,","),head,1)

copyDF
#  Var1 Freq
#1   B1 2538
#2   B4  633
#3   B6  458
#4   B3  328
#5   B4  135
#6   B5   56
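
If the column is a genuine list column rather than text (the question describes the values "as a list"), no string splitting is needed. A small sketch under that assumption; the data frame df and its id column are invented for illustration:

```r
# Hypothetical data frame with a list column, as described in the question
df <- data.frame(id = 1:4)
df$var1 <- list("A8", "A9", c("A1", "A1", "D3"), c("D1", "D1"))

# Take the first element of each list entry; vapply guarantees a character vector
df$var1 <- vapply(df$var1, function(x) as.character(x[1]), character(1))

df$var1
```

Which of the two approaches applies depends on what str() reports for the column: a factor or character column needs the split, a list column needs only the element extraction.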
