
Extracting data from list in R

library(RCurl)
library(rjson)
json <- getURL('https://extraction.import.io/query/runtime/17d882b5-c118-4f27-8ce1-90085ec0b116?_apikey=d5a8a01e20174e95887dc0f385e4e3f6d7ef5ca1428d5a029f2aa352509948ade8e5d7fb0dc941f4769a32b541ca6b38a7cd6578dfd81b357fbc4f2e008f5154f1dbfcff31878798fa887b70b1ff59dd&url=http%3A%2F%2Fwww.numbeo.com%2Fcost-of-living%2Fcompare_cities.jsp%3Fcountry1%3DSingapore%26country2%3DAustralia%26city1%3DSingapore%26city2%3DMelbourne')
obj <- fromJSON(json)

I would like to get this data into tidy columns, but many levels of the list are unnamed. Any idea how to organise it?

Check out this difference, and let me know what you think. This is what your object looks like:

library(RCurl)
library(rjson)
json <- getURL('https://extraction.import.io/query/runtime/17d882b5-c118-4f27-8ce1-90085ec0b116?_apikey=d5a8a01e20174e95887dc0f385e4e3f6d7ef5ca1428d5a029f2aa352509948ade8e5d7fb0dc941f4769a32b541ca6b38a7cd6578dfd81b357fbc4f2e008f5154f1dbfcff31878798fa887b70b1ff59dd&url=http%3A%2F%2Fwww.numbeo.com%2Fcost-of-living%2Fcompare_cities.jsp%3Fcountry1%3DSingapore%26country2%3DAustralia%26city1%3DSingapore%26city2%3DMelbourne')
obj <- rjson::fromJSON(json)
str(obj)

List of 2
 $ extractorData:List of 3
  ..$ url       : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
  ..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
  ..$ data      :List of 1
  .. ..$ :List of 1
  .. .. ..$ group:List of 52
  .. .. .. ..$ :List of 6
  .. .. .. .. ..$ COL VALUE        :List of 1
  .. .. .. .. .. ..$ :List of 1
  .. .. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"

There are indeed a lot of intermediate lists in there that you don't need. Now try the fromJSON function from the jsonlite package:

library(jsonlite)
obj2 <- jsonlite::fromJSON(json)
str(obj2)

List of 2
 $ extractorData:List of 3
  ..$ url       : chr "http://www.numbeo.com/cost-of-living/compare_cities.jsp?country1=Singapore&country2=Australia&city1=Singapore&city2=Melbourne"
  ..$ resourceId: chr "b1250747011ee774e7c881617c86a5a9"
  ..$ data      :'data.frame':  1 obs. of  1 variable:
  .. ..$ group:List of 1
  .. .. ..$ :'data.frame':  52 obs. of  6 variables:
  .. .. .. ..$ COL VALUE        :List of 52
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
  .. .. .. .. .. ..$ text: chr "Meal, Inexpensive Restaurant"
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
  .. .. .. .. .. ..$ text: chr "Meal for 2 People, Mid-range Restaurant, Three-course"
  .. .. .. .. ..$ :'data.frame':    1 obs. of  1 variable:
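The same contrast can be reproduced offline. The sketch below is only an illustration: the JSON string is a made-up miniature of the response (the key `name` and the values are hypothetical), and `simplifyVector = FALSE` is used to mimic the fully nested lists that rjson returns.

```r
library(jsonlite)

# Hypothetical miniature of the response: data -> group -> cells holding "text"
txt <- '{"data":[{"group":[{"name":[{"text":"Meal"}]},{"name":[{"text":"Beer"}]}]}]}'

# Simplification switched off: everything stays a nested list
nested <- fromJSON(txt, simplifyVector = FALSE)

# Default behaviour: arrays of objects are turned into data frames
simple <- fromJSON(txt)

# The inner data frame sits in a one-element list column, hence the [[1]]
toy <- simple$data$group[[1]]
str(toy)
```

Each cell of `toy$name` is still a one-row data frame, which is the same wrapping seen in the real output above.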

This JSON still isn't pretty, though, so we need to clean it up. I take it you want the data frame in there, so start with

df <- obj2$extractorData$data$group[[1]]

and there's your data frame. There is a problem, though: every single cell is wrapped in a list, including the NULL values. You can't just unlist those: they will disappear, and the columns that contained them will come out shorter.
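To see why the NULL cells matter, here is a minimal sketch on a toy list (the values are made up, not taken from the response):

```r
# Toy list column with a NULL cell in the middle (hypothetical values)
cells <- list("3.50", NULL, "12.00")

# unlist() silently drops the NULL, so the result is one element short
shortened <- unlist(cells)
length(shortened)

# Replacing NULL with NA first keeps every position
cells[sapply(cells, is.null)] <- NA
filled <- unlist(cells)
length(filled)
```

`shortened` has two elements while `filled` has three, with NA in the second position.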

Edit: Here's how to handle the columns with list(NULL) values.

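library(magrittr)  # provides the %>% pipe used below

# Replace NULL cells with NA so unlist() keeps each column at full length
df[sapply(df[,2],is.null),2] <- NA
df[sapply(df[,3],is.null),3] <- NA
df[sapply(df[,4],is.null),4] <- NA
df[sapply(df[,5],is.null),5] <- NA
# Flatten every list column, then turn the resulting matrix into a data frame
df2 <- sapply(df, unlist) %>% as.data.frame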

It can be written more elegantly for sure, but this'll get you going and it's understandable.
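One possible tidier variant, as a sketch only: a small helper (the name `flatten_col` is made up here) applied to every column, so the column numbers don't have to be repeated. The toy data frame below stands in for the real one and its column names and values are hypothetical.

```r
# Hypothetical helper: NULL cells become NA, then the list column is flattened
flatten_col <- function(col) {
  col[vapply(col, is.null, logical(1))] <- NA
  unlist(col, use.names = FALSE)
}

# Toy stand-in for df: list columns whose cells are 1x1 data frames or NULL
cell <- function(x) data.frame(text = x, stringsAsFactors = FALSE)
df <- data.frame(row = 1:3)
df$item  <- list(cell("Meal"), cell("Cappuccino"), cell("Beer"))
df$price <- list(cell("3.50"), NULL, cell("12.00"))

# Apply the helper to each column and rebuild a plain data frame
df2 <- as.data.frame(lapply(df, flatten_col), stringsAsFactors = FALSE)
str(df2)
```

Using lapply instead of sapply keeps each column's own type rather than coercing everything into a single character matrix.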


 