How to loop - JSONP / JSON data using R
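I thought I had parsed the data correctly using jsonlite and tidyjson. However, I noticed that only the data from the first page is being parsed. Please advise how I can parse all the pages correctly. The JSON output reports more than 1,300 pages in total, so I think the data is available but I am not retrieving and parsing it correctly.
Note: I used tidyjson, but jsonlite or any other library is fine too.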
library(dplyr)
library(tidyjson)
library(jsonlite)
req <- httr::GET("http://svcs.ebay.com/services/search/FindingService/v1?OPERATION-NAME=findItemsByKeywords&SERVICE-VERSION=1.0.0&SECURITY-APPNAME=xxxxxx&GLOBAL-ID=EBAY-US&RESPONSE-DATA-FORMAT=JSON&callback=_cb_findItemsByKeywords&REST-PAYLOAD&keywords=harry%20potter&paginationInput.entriesPerPage=100")
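txt <- httr::content(req, "text") # httr is not attached, so qualify content()
# Strip the JSONP wrapper so that only the JSON body remains
json <- sub("/**/_cb_findItemsByKeywords(", "", txt, fixed = TRUE)
json <- sub("\\)$", "", json)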
data1 <- json %>% as.tbl_json %>%
  enter_object("findItemsByKeywordsResponse") %>% gather_array %>%
  enter_object("searchResult") %>% gather_array %>%
  enter_object("item") %>% gather_array %>%
  spread_values(
    ITEMID = jstring("itemId"),
    TITLE = jstring("title")
  ) %>%
  select(ITEMID, TITLE) # select only what is needed
############################################################
*Note: "paginationOutput":[{"pageNumber":["1"],"entriesPerPage":["100"],"totalPages":["1393"],"totalEntries":["139269"]}]
* &_ipg=100&_pgn=1"
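The two sub() calls above depend on the exact callback name. A slightly more general way to strip the wrapper could look like the sketch below. It assumes the response is always wrapped as callback(...); strip_jsonp is a hypothetical helper, and the sample payload is made up rather than a real eBay response.

```r
library(jsonlite)

# Hypothetical helper: remove a JSONP wrapper such as
# /**/_cb_findItemsByKeywords({...}) and return the bare JSON text.
# Assumes the input really is JSONP; plain JSON should not be passed in.
strip_jsonp <- function(txt) {
  # Drop everything up to and including the first "(" ...
  txt <- sub("^[^(]*\\(", "", txt)
  # ... and the final ")" plus an optional ";" and trailing whitespace.
  sub("\\)[[:space:]]*;?[[:space:]]*$", "", txt)
}

# Made-up sample payload, not a real eBay response.
sample_txt <- '/**/_cb_findItemsByKeywords({"ack":["Success"],"n":[1]})'
parsed <- fromJSON(strip_jsonp(sample_txt))
parsed$ack
```

Leaving the callback parameter out of the request, as the answer's query list does, avoids the wrapper altogether.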
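There is no need for tidyjson. To use the code below you will need to write another function or set of calls to get the total number of pages (roughly 1,400), but that should be fairly straightforward. Try to compartmentalize your operations a bit more, and use the full power of httr to parameterize the request where you can: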
library(dplyr)
library(jsonlite)
library(httr)
library(purrr)
get_pg <- function(i) {
  cat(".") # shows progress
  # No callback parameter is sent, so the response is plain JSON
  req <- httr::GET("http://svcs.ebay.com/services/search/FindingService/v1",
                   query=list(`OPERATION-NAME`="findItemsByKeywords",
                              `SERVICE-VERSION`="1.0.0",
                              `SECURITY-APPNAME`="xxxxxxxxxxxxxxxxxxx",
                              `GLOBAL-ID`="EBAY-US",
                              `RESPONSE-DATA-FORMAT`="JSON",
                              `REST-PAYLOAD`="",
                              `keywords`="harry potter",
                              `paginationInput.pageNumber`=i,
                              `paginationInput.entriesPerPage`=100))
  dat <- fromJSON(content(req, as="text", encoding="UTF-8"))
  # itemId and title arrive as one-element arrays, so flatten them
  map_df(dat$findItemsByKeywordsResponse$searchResult[[1]]$item, function(x) {
    tibble(ITEMID=flatten_chr(x$itemId),
           TITLE=flatten_chr(x$title))
  })
}
# "10" will need to be the max page number. I wasn't about to
# make 1,400 requests to ebay. I'd probably break them up into
# sets of 30 or 50 and save off temporary data frames as rdata files
# just so you don't get stuck in a situation where R crashes and you
# have to get all the data again.
srch_dat <- map_df(1:10, get_pg)
srch_dat
## Source: local data frame [1,000 x 2]
##
##          ITEMID TITLE
##           (chr) (chr)
## 1  371533364795 Harry Potter: Complete 8-Film Collection (DVD, 2011, 8-Disc Set)
## 2  331128976689 HOT New Harry Potter 14.5" Magical Wand Replica Cosplay In Box
## 3  131721213216 Harry Potter: Complete 8-Film Collection (DVD, 2011, 8-Disc Set)
## 4  171430021529 New Harry Potter Hermione Granger Rotating Time Turner Necklace Gold Hourglass
## 5  261597812013 Harry Potter Time Turner+GOLD Deathly Hallows Charm Pendant necklace
## 6  111883750466 Harry Potter: Complete 8-Film Collection (DVD, 2011, 8-Disc Set)
## 7  251947403227 HOT New Harry Potter 14.5" Magical Wand Replica Cosplay In Box
## 8  351113839731 Marauder's Map Hogwarts Wizarding World Harry Potter Warner Bros LIMITED **NEW**
## 9  171912724869 Harry Potter Time Turner Necklace Hermione Granger Rotating Spins Gold Hourglass
## 10 182024752232 Harry Potter : Complete 8-Film Collection (DVD, 2011, 8-Disc Set) Free Shipping
## ..          ... ...
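The total page count that the answer leaves open can be read from the paginationOutput element quoted in the question, and the page numbers can then be split into batches as the code comments suggest. The following is a minimal sketch, not a tested client. It assumes paginationOutput sits inside the first element of findItemsByKeywordsResponse; total_pages_from_json and page_batches are hypothetical helpers, and the sample JSON is built from the fragment in the question rather than fetched from eBay.

```r
library(jsonlite)

# Hypothetical helper: pull totalPages out of the JSON text of one response.
# simplifyVector = FALSE keeps nested arrays as plain lists, so the [[1]]
# indexing mirrors the JSON structure directly.
# Assumption: paginationOutput is a child of findItemsByKeywordsResponse[0].
total_pages_from_json <- function(json_txt) {
  dat <- fromJSON(json_txt, simplifyVector = FALSE)
  pg <- dat$findItemsByKeywordsResponse[[1]]$paginationOutput[[1]]
  as.integer(pg$totalPages[[1]])
}

# Hypothetical helper: split pages 1..n_pages into consecutive batches.
page_batches <- function(n_pages, batch_size = 50) {
  pages <- seq_len(n_pages)
  unname(split(pages, ceiling(pages / batch_size)))
}

# Sample built from the paginationOutput fragment quoted in the question.
sample_json <- paste0(
  '{"findItemsByKeywordsResponse":[{"paginationOutput":[{',
  '"pageNumber":["1"],"entriesPerPage":["100"],',
  '"totalPages":["1393"],"totalEntries":["139269"]}]}]}'
)

n_pages <- total_pages_from_json(sample_json)
batches <- page_batches(n_pages, batch_size = 50)
length(batches)
```

Each batch could then be passed to get_pg() through map_df() and written to disk with saveRDS() before the next batch starts, so that a crash costs at most one batch of requests.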