
Convert json file to dataframe in R

I am new to R and am having trouble converting a JSON file to a data frame. I have a JSON file like the one below:

json_file = '[{"id": "abc", "model": "honda", "date": "20190604", "cols": {"action": 15, "values": 18, "not": 29}},
  {"id": "abc", "model": "honda", "date": "20190604", "cols": {"hello": 14, "hi": 85, "wow": 14}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"yesterday": 21, "today": 21, "tomorrow": 29}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"docs": 25, "ok": 87, "none": 42}}]'

I want to convert the above JSON file to a data frame in the following format:

Expected result

df =
id  model date     cols      values_cols
abc honda 20190604 action    15
abc honda 20190604 values    18
abc honda 20190604 not       29
abc honda 20190604 hello     14
abc honda 20190604 hi        85
abc honda 20190604 wow       14
mno ford  20190604 yesterday 21
mno ford  20190604 today     21
mno ford  20190604 tomorrow  29
mno ford  20190604 docs      25
mno ford  20190604 ok        87
mno ford  20190604 none      42

My result

    id model     date cols id.1 model.1   date.1 cols.1 id.2 model.2   date.2 cols.2 id.3 model.3   date.3 cols.3
action abc honda 20190604   15  abc   honda 20190604     14  mno    ford 20190604     21  mno    ford 20190604     25
values abc honda 20190604   18  abc   honda 20190604     85  mno    ford 20190604     21  mno    ford 20190604     87
not    abc honda 20190604   29  abc   honda 20190604     14  mno    ford 20190604     29  mno    ford 20190604     42
This is not correct: the keys inside cols are being used as the row index, and each JSON record is spread across its own set of columns.

My solution:

require(RJSONIO)
df = fromJSON(json_file)

The problem when reading the data with jsonlite::fromJSON is that the last column is itself a data frame, not an atomic vector.

tmp <- jsonlite::fromJSON(json_file)
str(tmp)
#'data.frame':   4 obs. of  4 variables:
# $ id   : chr  "abc" "abc" "mno" "mno"
# $ model: chr  "honda" "honda" "ford" "ford"
# $ date : chr  "20190604" "20190604" "20190604" "20190604"
# $ cols :'data.frame':  4 obs. of  12 variables:
#  ..$ action   : int  15 NA NA NA
#  ..$ values   : int  18 NA NA NA
#  ..$ not      : int  29 NA NA NA
#  ..$ hello    : int  NA 14 NA NA
#  ..$ hi       : int  NA 85 NA NA
#  ..$ wow      : int  NA 14 NA NA
#  ..$ yesterday: int  NA NA 21 NA
#  ..$ today    : int  NA NA 21 NA
#  ..$ tomorrow : int  NA NA 29 NA
#  ..$ docs     : int  NA NA NA 25
#  ..$ ok       : int  NA NA NA 87
#  ..$ none     : int  NA NA NA 42
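
To see the nesting directly, the following sketch checks that cols is itself a data frame and then lifts its columns to the top level with jsonlite::flatten(). This is an alternative to the cbind step, not what this answer uses. Note that flatten() prefixes the lifted columns with the parent name (cols.action and so on), so the names would need cleaning before reshaping. The name flat is only illustrative.

```r
library(jsonlite)

json_file <- '[{"id": "abc", "model": "honda", "date": "20190604", "cols": {"action": 15, "values": 18, "not": 29}},
  {"id": "abc", "model": "honda", "date": "20190604", "cols": {"hello": 14, "hi": 85, "wow": 14}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"yesterday": 21, "today": 21, "tomorrow": 29}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"docs": 25, "ok": 87, "none": 42}}]'

tmp <- fromJSON(json_file)

# The fourth column is a data frame nested inside the outer data frame.
is.data.frame(tmp$cols)

# flatten() moves the nested columns up one level, prefixing each
# with the name of the parent column ("cols.").
flat <- jsonlite::flatten(tmp)
names(flat)
```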

So, before reshaping the data from wide to long format, the last column must be combined with the other three columns using cbind.

tmp <- cbind(tmp[-4], tmp[[4]])
df1 <- reshape2::melt(tmp, id.vars = c("id", "model", "date"))
names(df1)[4:5] <- c("cols", "values_cols")
df1 <- df1[complete.cases(df1), ]
row.names(df1) <- NULL

df1
#    id model     date      cols values_cols
#1  abc honda 20190604    action          15
#2  abc honda 20190604    values          18
#3  abc honda 20190604       not          29
#4  abc honda 20190604     hello          14
#5  abc honda 20190604        hi          85
#6  abc honda 20190604       wow          14
#7  mno  ford 20190604 yesterday          21
#8  mno  ford 20190604     today          21
#9  mno  ford 20190604  tomorrow          29
#10 mno  ford 20190604      docs          25
#11 mno  ford 20190604        ok          87
#12 mno  ford 20190604      none          42

Now clean up the .GlobalEnv:

rm(tmp)    # no longer needed.
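
As an alternative sketch that avoids the wide intermediate table and the NA filtering (not the approach above, and assuming every record has the id, model, date and cols fields), the JSON can be parsed into a plain list and one small data frame built per record. The names records, rows and df2 are only illustrative.

```r
library(jsonlite)

json_file <- '[{"id": "abc", "model": "honda", "date": "20190604", "cols": {"action": 15, "values": 18, "not": 29}},
  {"id": "abc", "model": "honda", "date": "20190604", "cols": {"hello": 14, "hi": 85, "wow": 14}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"yesterday": 21, "today": 21, "tomorrow": 29}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"docs": 25, "ok": 87, "none": 42}}]'

# Keep the parsed JSON as a list of records rather than a nested data frame.
records <- fromJSON(json_file, simplifyVector = FALSE)

# One data frame per record: the scalar fields are recycled across
# that record's key/value pairs in cols.
rows <- lapply(records, function(rec) {
  data.frame(
    id = rec$id,
    model = rec$model,
    date = rec$date,
    cols = names(rec$cols),
    values_cols = unlist(rec$cols, use.names = FALSE),
    stringsAsFactors = FALSE
  )
})

# Stack the per-record data frames into the long format.
df2 <- do.call(rbind, rows)
df2
```

Because each record contributes only its own keys, no NA rows are created, so there is nothing to drop with complete.cases afterwards.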
