I have an input file like:
1A10, 77002, 77003, 77010, 77020
1A20, 77002, 77006, 77007, 77019
1A30, 77006, 77019, 77098
1A40, 77007, 77019, 77027, 77098
1A50, 77005, 77007, 77019, 77024, 77027, 77046, 77081, 77098, 77401
etc....
I want to create a data frame (tibble) where the first column is the same as the first column of my csv, and the second column is a list corresponding to the rest of the columns.
I have failed miserably. Here is my latest failed attempt:
library(stringr)
library(tidyverse)
options(stringsAsFactors = FALSE)
infile <- "~/Rprojects/CrimeStats/BeatZipcodes.csv"
# create empty data frame
BeatToZip <- data_frame(
  beat = character(),
  zips = list()
)
con <- file(infile, open = "r")
line <- readLines(con)
long <- length(line)
for (i in 1:long) {
  print(line[i])
  line[i] <- trimws(line[i])
  beat <- str_split(line[i], ", *")[[1]][1]
  zips <- as.list(str_split(line[i], ", *")[[1]][-1])
  temp <- data_frame(beat, zips)
  BeatToZip <- rbind(BeatToZip, temp)
}
close(con)
One option is to read the file with read.csv and fill = TRUE, so that the shorter rows are padded with NA:
library(tidyverse)
df1 <- read.csv(infile, fill = TRUE, header = FALSE)
Then gather all the columns except the first one, group by the first column (named V1, since the file was read with header = FALSE), and summarise the remaining values into a list column:
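df1 %>%
  # reshape to long format, dropping the NA padding added by fill = TRUE
  gather(key, val, -1, na.rm = TRUE) %>%
  # group by the beat (first column), not by the gathered column name
  group_by(V1) %>%
  summarise(listCol = list(val))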
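As an alternative, the lines can be split directly, which avoids relying on read.csv to work out how many columns a ragged file has. The following is only a minimal sketch under some assumptions: the sample lines are copied from the question instead of being read from infile, the name beat_to_zip is made up for illustration, and the zip codes are kept as character strings. The key difference from the loop in the question is that each row's zips are stored as a single list element rather than spread over several rows.

```r
library(tibble)

# Sample lines copied from the question; in practice these would
# come from readLines(infile).
lines <- c(
  "1A10, 77002, 77003, 77010, 77020",
  "1A20, 77002, 77006, 77007, 77019",
  "1A30, 77006, 77019, 77098"
)

# Split each line on a comma followed by optional spaces.
fields <- strsplit(trimws(lines), ", *")

# The first field is the beat; the remaining fields become one
# list element (a character vector) per row.
beat_to_zip <- tibble(
  beat = vapply(fields, function(x) x[1], character(1)),
  zips = lapply(fields, function(x) x[-1])
)

print(beat_to_zip)
```

Individual entries can then be pulled out with, for example, beat_to_zip$zips[[3]], which holds the zip codes for beat 1A30.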