I am trying to convert the output of a Census API call (saved as an .rds file) into an R data frame object. For convenience, let's call the object 'x'.
Each element of x is a list
# Both of the below return 'list'
class(x[i])
class(x[[i]])
Each element of that list is either...
A character vector
# Returns 'list'
class(x[[i]][k])
# Returns 'character'
class(x[[i]][[k]])
A list
# Returns 'list'
class(x[[i]][k])
# Returns 'list'
class(x[[i]][[k]])
Whether the element is a list or a character vector depends on whether the value "NULL" appears in the row of data. If any element of the row is "NULL", the element is a list. If no element of the row is "NULL", the element is a character vector.
If the element is a list, each of its elements is of class "NULL" if the value is NULL, or of class "character" if the value is not NULL.
# Returns 'list'
class(x[[i]][[k]][g])
# Returns "NULL" if the value is NULL, else "character"
class(x[[i]][[k]][[g]])
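As a rough illustration of the nesting described above, here is a toy stand-in for x (not the real API output; the values and the two-variable header are made up) along with the class checks it should satisfy:

```r
# Toy stand-in for x: one county holding a header row, a complete row,
# and a row with a missing value. All values here are invented.
x <- list(
  list(
    c("var1", "var2"),   # header row: character vector
    c("1.5", "2.5"),     # complete row: character vector
    list(NULL, "2.5")    # row with a missing value: list
  )
)

class(x[[1]])             # "list"
class(x[[1]][[2]])        # "character"
class(x[[1]][[3]])        # "list"
class(x[[1]][[3]][[1]])   # "NULL"
class(x[[1]][[3]][[2]])   # "character"

# Summarize the class of every row of the first county in one call.
row_classes <- vapply(x[[1]], class, character(1))
```

Running vapply() over each county this way is a quick check of which rows arrived as lists and which as character vectors.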
Can anyone propose a method for converting this into a data frame? I am having enormous difficulty with figuring out how to convert the block group elements into an object that I can apply() or loop across.
In response to requests for a reproducible example, see the code below. It demonstrates a small version of the data I have (my data contains many counties, block groups, and variables). Notice that the length of each block group vector or list equals the number of variables, because the elements of the vector are the values of the block group for each respective variable. My goal is to produce a data frame with column names var1, var2, var3, var4, in which each row holds the values for one block group.
set.seed(5)
# County 1
bezz <- c("var1","var2","var3","var4") # variable names
bizz <- as.character(round(rnorm(4),2)) # block group 1.1
buzz <- list("NULL","NULL","2","94389") # block group 1.2
bozz <- as.character(round(rnorm(4),2)) # block group 1.3
bazz <- list("NULL","NULL","888888888","NULL") # block group 1.4
foo <- list(bezz, bizz,buzz,bozz,bazz) # county 1 object
# County 2
fezz <- c("var1","var2","var3","var4") # variable names
fizz <- list("NULL","2","NULL","94389") # block group 2.1
fuzz <- as.character(round(rnorm(4),2)) # block group 2.2
fozz <- as.character(round(rnorm(4),2)) # block group 2.3
bar <- list(fezz, fizz,fuzz,fozz) # county 2 object
# County 3
lezz <- c("var1","var2","var3","var4") # variable names
luzz <- as.character(round(rnorm(4),2)) # block group 3.1
baz <- list(lezz, luzz) # county 3 object
# API output
mydata <- list(foo,bar,baz) # all counties in a list
This solution requires that all NULLs be converted to NAs. Since all the data appear to be numeric, as.numeric() has been used; just remove it if that is not what you want.
This may take a while to run, and there may be more efficient ways to go about it. The two loops could be merged into one, but for the sake of clarity the NULL-to-NA loop has been kept separate.
have <- readRDS("~/R/SO/acs0509_block_group_call.Rds")
# replace NULLs with NAs
for (i in seq_along(have)) {
  for (j in seq_along(have[[i]])) {
    for (k in seq_along(have[[i]][[j]])) {
      if (is.null(have[[i]][[j]][[k]])) {
        have[[i]][[j]][[k]] <- NA
      }
    }
  }
}
# initialize the "want" data.frame with an arbitrary row
want <- as.data.frame(t(as.numeric(have[[1]][[2]])))
# the first element of each county holds the variable names
colnames(want) <- have[[1]][[1]]
ins.row <- 1
for (i in seq_along(have)) {
  # start at 2 to skip the header element of each county
  for (j in 2:length(have[[i]])) {
    if (is.list(have[[i]][[j]])) {
      want[ins.row, ] <- as.numeric(unlist(have[[i]][[j]]))
    } else {
      want[ins.row, ] <- as.numeric(have[[i]][[j]])
    }
    ins.row <- ins.row + 1
  }
}
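For comparison, here is a minimal sketch of a loop-free variant of the same idea, run on a toy input shaped like the question's reproducible example. The helper names row_to_numeric and county_to_df are made up for this illustration, and the sketch assumes every county has a header plus at least one block group row and that all values are meant to be numeric. It treats both a real NULL and the string "NULL" as missing.

```r
# Toy input shaped like the question's example: each county is a list whose
# first element holds the variable names and whose remaining elements are
# block-group rows (character vectors, or lists when a value is missing).
county1 <- list(
  c("var1", "var2", "var3", "var4"),
  c("-0.84", "1.38", "-1.26", "0.07"),
  list("NULL", NULL, "2", "94389")   # the string "NULL" and a real NULL
)
county2 <- list(
  c("var1", "var2", "var3", "var4"),
  list("NULL", "2", "NULL", "94389")
)
mydata <- list(county1, county2)

# Hypothetical helper: turn one block-group row into a numeric vector,
# mapping real NULLs and the string "NULL" to NA.
row_to_numeric <- function(row) {
  vals <- vapply(row, function(v) {
    if (is.null(v) || identical(v, "NULL")) NA_character_ else as.character(v)
  }, character(1))
  as.numeric(vals)
}

# Hypothetical helper: convert one county (header + rows) to a data frame.
county_to_df <- function(county) {
  rows <- lapply(county[-1], row_to_numeric)   # drop the header element
  df <- as.data.frame(do.call(rbind, rows))    # one row per block group
  colnames(df) <- county[[1]]                  # header supplies column names
  df
}

# Stack the per-county data frames into a single data frame.
want <- do.call(rbind, lapply(mydata, county_to_df))
```

Building one data frame per county and binding them once avoids growing want row by row, which is typically the slow part of the loop version on large inputs.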