
In R, converting a ragged CSV into a data frame with a value column and a list column

I have an input file like:

1A10, 77002, 77003, 77010, 77020
1A20, 77002, 77006, 77007, 77019
1A30, 77006, 77019, 77098
1A40, 77007, 77019, 77027, 77098
1A50, 77005, 77007, 77019, 77024, 77027, 77046, 77081, 77098, 77401
etc....

I want to create a data frame (tibble) where the first column is the same as the first column of my CSV, and the second column is a list column holding the remaining values from that row.

I have failed miserably. Here is my latest failed attempt:

library(stringr)
library(tidyverse)

options(stringsAsFactors = FALSE)

infile <- "~/Rprojects/CrimeStats/BeatZipcodes.csv"

# create empty data frame
BeatToZip <- data_frame(
    beat=character(),
    zips=list()
)

con=file(infile,open="r")
line=readLines(con) 
long=length(line)
for (i in 1:long){
    print(line[i])
    line[i] <- trimws(line[i])
    beat <- str_split(line[i],", *")[[1]][1]
    zips <- as.list(str_split(line[i],", *")[[1]][-1])
    temp <- data_frame(beat, zips)
    BeatToZip <- rbind(BeatToZip, temp)
}
close(con)

One option is to first read the file with read.csv and fill = TRUE, so that the shorter rows are padded with NA:

library(tidyverse)
df1 <- read.csv(infile, fill = TRUE, header = FALSE)
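To see what this step produces without the original file, here is a small sketch that feeds a few of the sample rows in through the text argument. One caveat: read.csv works out the number of columns from the first five lines of input, so a longer row further down a real file may not be laid out as expected. Passing col.names sized with count.fields is one way to guard against that. The names sample_txt, tc and n_col are made up for this illustration.

```r
# A few sample rows from the question, supplied inline
# (hypothetical stand-in for the real infile)
sample_txt <- c(
  "1A10, 77002, 77003, 77010, 77020",
  "1A30, 77006, 77019, 77098",
  "1A50, 77005, 77007, 77019, 77024, 77027, 77046, 77081, 77098, 77401"
)

# Count the fields on every line; the widest row decides the column count
tc <- textConnection(sample_txt)
n_col <- max(count.fields(tc, sep = ","))
close(tc)

# Naming all columns up front makes read.csv allocate enough of them
df1 <- read.csv(text = sample_txt, header = FALSE, fill = TRUE,
                col.names = paste0("V", seq_len(n_col)))

# Short rows are padded with NA on the right
dim(df1)
```

With a real file, the same idea would be count.fields(infile, sep = ",") followed by read.csv(infile, ...).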

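Then gather all the columns except the first into key/value pairs (dropping the NA padding), group by the first column, which read.csv names V1 because the file has no header, and summarise the values into a list column:

df1 %>%
   gather(key, val, -1, na.rm = TRUE) %>%
   group_by(V1) %>%
   summarise(listCol = list(val))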
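An alternative that skips the wide intermediate table is to split each line yourself and put the pieces straight into a list column. This is only a sketch, assuming the fields are separated by a comma plus optional spaces as in the sample. raw_lines and parts are illustrative names, and the inline vector stands in for readLines(infile).

```r
library(tibble)

# Stand-in for readLines(infile); sample rows taken from the question
raw_lines <- c(
  "1A10, 77002, 77003, 77010, 77020",
  "1A30, 77006, 77019, 77098"
)

# Split every line on a comma followed by optional whitespace
parts <- strsplit(trimws(raw_lines), ",\\s*")

BeatToZip <- tibble(
  # first field of each line
  beat = vapply(parts, function(p) p[1], character(1)),
  # remaining fields, kept as one character vector per row
  zips = lapply(parts, function(p) p[-1])
)
```

The zip codes stay as character strings here, which preserves any leading zeros. This also suggests why the loop in the question misbehaves: as.list() turns the zips into a list with one element per zip, which the tibble constructor appears to treat as several rows rather than a single cell. Wrapping the whole vector, as in list(zips_vector), would give one row per beat.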

