Split String into factor using R

I sent out a fun questionnaire to our office to get some data for putting together a workflow for handling questionnaires in future. Some of the questions had textual input, and the responses were comma separated lists. The data were collected using a Google form, so they ended up in a spreadsheet. I'm linking directly to this spreadsheet to get the data into R so I'd prefer not to do any more pre-processing on the data than I have to.

Because the CSV coming into R is comma-separated too, I swap the commas inside the responses for pipes ('|'). I'd like to make bar charts out of the responses to questions like "what's your favorite piece of industrial design", but lots of people have given more than one answer, such as "iPhone, coke bottle". This shows up as a single bar labeled iPhone|coke bottle.

I'd like to split it up so that the iPhone part contributes to the iPhone bar etc. In other languages I'd concatenate the whole list with a pipe separator, then split it again on the pipes then work with that new list. I'm stuck trying this approach in R; is it the right way to go or is there a more R way to do it?

a <- BVNdhData$Pets
b <- paste(a,collapse ="|")
c <- strsplit(b,"|",fixed=TRUE)

That all works, but it leaves me with a list that I have no idea what to do with.

If you call unlist() on the results of strsplit() you get a single character vector with all of the components of your text:

text <- c("cake|pie|sausage roll", "scotch egg|pie")
x <- unlist(strsplit(text, "\\|"))
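The same idea can be applied directly to a data frame column. This is only a minimal sketch: the data frame `responses` and its column `design` are hypothetical stand-ins for the questionnaire data. Splitting on the pipe with optional surrounding whitespace keeps "iPhone" and " iPhone" from being counted as different answers, and `factor()` gives the factor that the title asks for.

```r
# Hypothetical stand-in for the questionnaire data (names are assumptions)
responses <- data.frame(
  design = c("iPhone|coke bottle", "iPhone | Anglepoise lamp", "coke bottle"),
  stringsAsFactors = FALSE
)

# Split each response on "|", allowing optional whitespace around the pipe
parts <- strsplit(responses$design, "\\s*\\|\\s*")

# Flatten the list into one character vector and trim stray whitespace
items <- trimws(unlist(parts))

# Convert to a factor; each distinct answer becomes one level
items_f <- factor(items)
levels(items_f)
table(items_f)
```

Because the split is done per response, there is no need to paste everything into one long string first.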

Use table() to tabulate the entries:

table(x)

x
        cake          pie sausage roll   scotch egg 
           1            2            1            1 

Then coerce it to a data frame...

dat <- as.data.frame(table(x))
dat


             x Freq
1         cake    1
2          pie    2
3 sausage roll    1
4   scotch egg    1

... and plot:

library(ggplot2)
ggplot(dat, aes(x, Freq)) + geom_point()
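Since the question asks for bar charts, here is a rough sketch of one way to draw them. It swaps in base graphics `barplot()` rather than ggplot2 so that it needs no extra packages, and it sorts the counts so the most common answer comes first. The variable names `counts` and `mids` are just illustrative.

```r
text <- c("cake|pie|sausage roll", "scotch egg|pie")

# Split on a literal pipe, flatten, tabulate, and sort by frequency
counts <- sort(table(unlist(strsplit(text, "|", fixed = TRUE))),
               decreasing = TRUE)

# Draw one bar per distinct answer; las = 2 rotates the axis labels
# so longer answers stay readable
mids <- barplot(counts, las = 2, ylab = "Freq")
```

`barplot()` returns the bar midpoints, which can be handy for adding text labels above the bars later.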

