Convert Nested JSON Object into Data Frame in R
I am fetching data from the Twitter API. I want to convert the data from a JSON object to a data frame and load it into a data warehouse. The input and code snippet are below.
I am very new to R programming.
stats_campaign.data <- content(stats_campaign.request)
print(stats_campaign.data)
Output:
`{
"data_type": [ "stats" ],
"time_series_length": [ 1 ],
"data": [
{
"id": [ "XXXXX" ],
"id_data": [
{
"segment": {},
"metrics": {
"impressions": {},
"tweets_send": {},
"qualified_impressions": {},
"follows": {},
"app_clicks": {},
"retweets": {},
"likes": {},
"engagements": {},
"clicks": {},
"card_engagements": {},
"replies": {},
"url_clicks": {},
"carousel_swipes": {}
}
}
]
},
{
"id": [ "XXXX1" ],
"id_data": [
{
"segment": {},
"metrics": {
"impressions": {},
"tweets_send": {},
"qualified_impressions": {},
"follows": {},
"app_clicks": {},
"retweets": {},
"likes": {},
"engagements": {},
"clicks": {},
"card_engagements": {},
"replies": {},
"url_clicks": {},
"carousel_swipes": {}
}
}
]
},`
When I read this JSON file:
stats_json_file <- sprintf("P:/R Repos/R Applications/TwitterAPIData/stats_test_data-%s.json", TODAY)
jsonlite::fromJSON(stats_json_file)
**Result:**
id id_data
1 5wcaz NULL
2 5ub2u NULL
3 5wb8x NULL
4 5wb1j NULL
5 5yqwj NULL
6 5pq5i NULL
7 5u197 NULL
8 5z2js NULL
9 6fqh0 333250, 4, 9, 19, 111, 3189, 3156, 5, 1091
10 5tvr1 NULL
11 5yqw4 NULL
12 5qqps NULL
13 5yqvw NULL
14 5ygom NULL
15 5nc88 NULL
16 5yg94 NULL
17 65t9e NULL
18 5peck NULL
19 63pg1 247283, 17, 22, 35, 297, 5514, 5450, 6, 2971
20 6cdvy 156705, 1, 2, 6, 112, 10933, 605, 170
From my JSON file I want the id and the whole metrics object:
"metrics": {
"impressions": {},
"tweets_send": {},
"qualified_impressions": {},
"follows": {},
"app_clicks": {},
"retweets": {},
"likes": {},
"engagements": {},
"clicks": {},
"card_engagements": {},
"replies": {},
"url_clicks": {},
"carousel_swipes": {}
}
I then want to convert these to a data frame to load into a database. Please help!
How can I parse this JSON object? I want to retrieve the id and the whole metrics object, then convert them into a data frame to load into a SQL table.
To read the multiple ids and metrics values I used the code below:
`test <- list()
for (i in 1:len) {
  test <- unlist(stats_campaign.data$data[[i]])
  print(test)
}`
**Output:**
id
"5wcaz"
id
"5ub2u"
id
"5wb8x"
id
"5wb1j"
id
"5yqwj"
id
"5pq5i"
id
"5u197"
id
"5z2js"
id
"5tvr1"
id
"5yqw4"
id
"5qqps"
id
"5yqvw"
id
"5ygom"
id
"5nc88"
id
"5yg94"
id
"65t9e"
id
"5peck"
id id_data.metrics.impressions
"63pg1" "133227"
id_data.metrics.tweets_send id_data.metrics.follows
"10" "9"
id_data.metrics.retweets id_data.metrics.likes
"17" "96"
id_data.metrics.engagements id_data.metrics.clicks
"2165" "2134"
id_data.metrics.replies id_data.metrics.url_clicks
"5" "1204"
id id_data.metrics.impressions
"6cdvy" "176164"
id_data.metrics.tweets_send id_data.metrics.retweets
"2" "10"
id_data.metrics.likes id_data.metrics.engagements
"121" "9708"
id_data.metrics.clicks id_data.metrics.url_clicks
"620" "160"
Within the for loop I have to use a list or something else to append the values on each iteration. How can I do that? Am I using the right approach? Is there an alternative way to parse a nested JSON object and put it directly into a data frame?
Please help! Thanks in advance!
As mentioned in the comments, a bit more information about what output you are looking for would be helpful. In any case, I am hopeful that the following will provide a helpful direction. The `tidyjson` README provides a helpful overview.
Unfortunately, the lack of data in your JSON object makes it difficult to illustrate what might be present in your data (what to expect in the null objects), and I am having difficulty determining what part of the Twitter API you are looking at. `tidyjson` gives you the ability to produce a consistent `data.frame` output even when you have no data, though! The key verbs are `gather` and `spread`, much like `tidyr`, but with JSON flavor.
str <- "{\"data_type\":[\"stats\"],\"time_series_length\":[1],\"data\":[{\"id\":[\"XXXXX\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]},{\"id\":[\"XXXX1\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]}]} "
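library(dplyr)
library(tidyjson)

# Enter the top-level "data" array and create one row per element
prep <- as.tbl_json(str) %>% enter_object("data") %>% gather_array("objid")

# "id" is itself an array, so gather it and append its string value
p1 <- prep %>% enter_object("id") %>%
  gather_array("idnum") %>% append_values_string("id")

# One row per "id_data" element, then pull fields out of "metrics".
# The inner keys ("value", "somekey") appear to be placeholders, since the sample has no data.
p2 <- prep %>% enter_object("id_data") %>% gather_array("datanum") %>%
  enter_object("metrics") %>%
  spread_values(
    impressions = jstring("impressions", "value")
    , tweets_send = jnumber("tweets_send", "somekey")
  )

# Join ids and metrics on the shared index columns
p1 %>% tbl_df() %>% left_join(p2 %>% tbl_df(), by = c("document.id", "objid"))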
#> # A tibble: 2 x 7
#> document.id objid idnum id datanum impressions tweets_send
#> <int> <int> <int> <chr> <int> <chr> <dbl>
#> 1 1 1 1 XXXXX 1 <NA> NA
#> 2 1 2 1 XXXX1 1 <NA> NA
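On the question about appending inside the `for` loop: one common pattern is to store each flattened record in a pre-allocated list and bind the list into a data frame afterwards. Below is a minimal sketch in base R, not a definitive solution. It assumes that `parsed` (a hand-built, hypothetical stand-in for `content(stats_campaign.request)`) has the same shape as your real response and that each metric holds a single value. The ids and numbers are a subset of those printed in the question.

```r
# Hypothetical stand-in for content(stats_campaign.request); ids and values
# are taken from the output printed in the question.
parsed <- list(data = list(
  list(id = "5wcaz",
       id_data = list(list(segment = NULL, metrics = list()))),
  list(id = "63pg1",
       id_data = list(list(segment = NULL,
                           metrics = list(impressions = 133227,
                                          tweets_send = 10,
                                          follows = 9)))),
  list(id = "6cdvy",
       id_data = list(list(segment = NULL,
                           metrics = list(impressions = 176164,
                                          tweets_send = 2))))
))

# Collect one flattened (named character) vector per record instead of
# overwriting the same variable on every iteration.
rows <- vector("list", length(parsed$data))
for (i in seq_along(parsed$data)) {
  rows[[i]] <- unlist(parsed$data[[i]])
}

# Records carry different sets of metrics, so pad each one to the union
# of all names, using NA for anything missing.
all_cols <- unique(unlist(lapply(rows, names)))
padded <- lapply(rows, function(r) {
  out <- setNames(rep(NA_character_, length(all_cols)), all_cols)
  out[names(r)] <- r
  out
})

# Bind the padded vectors into a character matrix, then a data frame.
result <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)

# unlist() coerced everything to character; convert the metrics back to numeric
# and drop the "id_data.metrics." prefix from the column names.
metric_cols <- setdiff(all_cols, "id")
result[metric_cols] <- lapply(result[metric_cols], as.numeric)
names(result) <- sub("^id_data\\.metrics\\.", "", names(result))

print(result)
```

The result has one row per id, with `NA` wherever a metric was absent. That is a shape a function such as `DBI::dbWriteTable()` should be able to load into a SQL table.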