
JSON to R not reading column names

I'm trying to pull some data from the US Census website, which comes in JSON. This is what it looks like:

data_from_api <- readr::read_file('https://api.census.gov/data/2016/zbp?get=ESTAB,EMPSZES,EMPSZES_TTL,ST,YEAR&for=ZIPCODE:20004')

data_from_api

When I parse it with jsonlite, it looks like this:

> data_from_api <- fromJSON(data_from_api)
> data_from_api
      [,1]    [,2]      [,3]                                          [,4] [,5]   [,6]     
 [1,] "ESTAB" "EMPSZES" "EMPSZES_TTL"                                 "ST" "YEAR" "zipcode"
 [2,] "925"   "001"     "All establishments"                          "11" "2016" "20004"  
 [3,] "406"   "212"     "Establishments with 1 to 4 employees"        "11" "2016" "20004"  
 [4,] "154"   "220"     "Establishments with 5 to 9 employees"        "11" "2016" "20004"  
 [5,] "113"   "230"     "Establishments with 10 to 19 employees"      "11" "2016" "20004"  
 [6,] "122"   "241"     "Establishments with 20 to 49 employees"      "11" "2016" "20004"  
 [7,] "70"    "242"     "Establishments with 50 to 99 employees"      "11" "2016" "20004"  
 [8,] "45"    "251"     "Establishments with 100 to 249 employees"    "11" "2016" "20004"  
 [9,] "8"     "252"     "Establishments with 250 to 499 employees"    "11" "2016" "20004"  
[10,] "6"     "254"     "Establishments with 500 to 999 employees"    "11" "2016" "20004"  
[11,] "1"     "260"     "Establishments with 1,000 employees or more" "11" "2016" "20004" 

Any idea why the column names are not flowing properly? Can I change any input to make it work?

Thanks

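This is not a fault in fromJSON. The API returns a JSON array of arrays whose first row happens to hold the header, so jsonlite simplifies it to a character matrix and has no way to know that row 1 contains column names.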
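To illustrate the difference between the two JSON shapes, here is a minimal sketch using small hand-written JSON strings (made up for illustration, not the actual Census response). It assumes the jsonlite package is installed and relies on its default simplification behaviour.

```r
library(jsonlite)

# Array of arrays: there are no keys, so nothing can serve as column names.
# With the default settings this is simplified to a character matrix.
as_matrix <- fromJSON('[["ESTAB","ST"],["925","11"],["406","11"]]')

# Array of objects: the keys become the column names of a data frame.
as_frame <- fromJSON('[{"ESTAB":"925","ST":"11"},{"ESTAB":"406","ST":"11"}]')

is.matrix(as_matrix)
is.data.frame(as_frame)
names(as_frame)
```

The Census endpoint in the question uses the first shape, which is why the header ends up as an ordinary data row.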

It's trivial to convert this to a correctly named data.frame:

# The first row holds the column names; the remaining rows are the data
colnms <- data_from_api[1,]
data_from_api <- as.data.frame(data_from_api[-1,], check.names = FALSE, stringsAsFactors = FALSE)
names(data_from_api) <- colnms

The data is delivered as an array of arrays (which jsonlite simplifies to a character matrix), not as an array of objects (which would become a data frame). Getting a data frame takes only some simple manipulation:

x <- jsonlite::fromJSON(data_from_api)
x
#       [,1]    [,2]      [,3]                                          [,4] [,5]   [,6]     
#  [1,] "ESTAB" "EMPSZES" "EMPSZES_TTL"                                 "ST" "YEAR" "zipcode"
#  [2,] "925"   "001"     "All establishments"                          "11" "2016" "20004"  
#  [3,] "406"   "212"     "Establishments with 1 to 4 employees"        "11" "2016" "20004"  
#  [4,] "154"   "220"     "Establishments with 5 to 9 employees"        "11" "2016" "20004"  
#  [5,] "113"   "230"     "Establishments with 10 to 19 employees"      "11" "2016" "20004"  
#  [6,] "122"   "241"     "Establishments with 20 to 49 employees"      "11" "2016" "20004"  
#  [7,] "70"    "242"     "Establishments with 50 to 99 employees"      "11" "2016" "20004"  
#  [8,] "45"    "251"     "Establishments with 100 to 249 employees"    "11" "2016" "20004"  
#  [9,] "8"     "252"     "Establishments with 250 to 499 employees"    "11" "2016" "20004"  
# [10,] "6"     "254"     "Establishments with 500 to 999 employees"    "11" "2016" "20004"  
# [11,] "1"     "260"     "Establishments with 1,000 employees or more" "11" "2016" "20004"  


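# Promote the first row to column names, then drop it from the data
colnames(x) <- x[1,]
x <- x[-1,]
x2 <- as.data.frame(x, stringsAsFactors = FALSE)
# Convert the numeric-looking columns; note that as.integer() drops leading zeros ("001" becomes 1)
x2[c(1,2,4,5,6)] <- lapply(x2[c(1,2,4,5,6)], as.integer)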

str(x2)
# 'data.frame': 10 obs. of  6 variables:
#  $ ESTAB      : int  925 406 154 113 122 70 45 8 6 1
#  $ EMPSZES    : int  1 212 220 230 241 242 251 252 254 260
#  $ EMPSZES_TTL: chr  "All establishments" "Establishments with 1 to 4 employees" "Establishments with 5 to 9 employees" "Establishments with 10 to 19 employees" ...
#  $ ST         : int  11 11 11 11 11 11 11 11 11 11
#  $ YEAR       : int  2016 2016 2016 2016 2016 2016 2016 2016 2016 2016
#  $ zipcode    : int  20004 20004 20004 20004 20004 20004 20004 20004 20004 20004
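If this conversion is needed for more than one query, the steps above could be wrapped in a small function. The sketch below is only one possible way to do it: `header_matrix_to_df` and its `int_cols` argument are hypothetical names, and the input matrix is built by hand so the example runs without calling the API. It converts only the columns named explicitly, so code-like columns such as EMPSZES or zipcode can stay as character and keep their leading zeros.

```r
# Hypothetical helper: turn a character matrix whose first row is a header
# into a data frame, converting only the columns listed in `int_cols`.
header_matrix_to_df <- function(m, int_cols = character()) {
  stopifnot(is.matrix(m), nrow(m) >= 1)
  # drop = FALSE keeps a matrix even if only one data row remains
  df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- m[1, ]
  if (length(int_cols) > 0) {
    df[int_cols] <- lapply(df[int_cols], as.integer)
  }
  df
}

# Hand-built stand-in for the matrix returned by fromJSON
m <- rbind(
  c("ESTAB", "EMPSZES", "zipcode"),
  c("925", "001", "20004"),
  c("406", "212", "20004")
)

df <- header_matrix_to_df(m, int_cols = "ESTAB")
str(df)
```

Whether codes and ZIP codes should be integers or strings depends on how they are used later; leaving them as character is the more conservative choice.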
