R data.frame to JSON with child nodes / hierarchical

I am trying to write a data.frame from R into a JSON file, but in a hierarchical structure with child nodes nested inside it. I found some examples and the RJSONIO package, but I wasn't able to apply them to my case.

This is the data.frame in R

> DF
   Date_by_Month    CCG Year Month refYear      name OC_5a OC_5b OC_5c 
1     2010-01-01 MyTown 2010    01    2009 2009/2010     0    15    27 
2     2010-02-01 MyTown 2010    02    2009 2009/2010     1    14    22 
3     2010-03-01 MyTown 2010    03    2009 2009/2010     1     6    10 
4     2010-04-01 MyTown 2010    04    2010 2010/2011     0    10    10 
5     2010-05-01 MyTown 2010    05    2010 2010/2011     1    16     7 
6     2010-06-01 MyTown 2010    06    2010 2010/2011     0    13    25 

In addition to writing the data by month, I would also like to create an aggregate child, the 'yearly' one, which holds the sum (for example) of all the months that fall in that year. This is what I would like the JSON file to look like:

[
    {
     "ccg":"MyTown",
     "data":[
            {"period":"yearly",
             "scores":[
                {"name":"2009/2010","refYear":"2009","OC_5a":2, "OC_5b": 35, "OC_5c": 59},
                {"name":"2010/2011","refYear":"2010","OC_5a":1, "OC_5b": 39, "OC_5c": 42}
             ]
             },
            {"period":"monthly",
             "scores":[
                {"name":"2009/2010","refYear":"2009","month":"01","year":"2010","OC_5a":0, "OC_5b": 15, "OC_5c": 27},
                {"name":"2009/2010","refYear":"2009","month":"02","year":"2010","OC_5a":1, "OC_5b": 14, "OC_5c": 22},
                {"name":"2009/2010","refYear":"2009","month":"03","year":"2010","OC_5a":1, "OC_5b": 6, "OC_5c": 10},
                {"name":"2010/2011","refYear":"2010","month":"04","year":"2010","OC_5a":0, "OC_5b": 10, "OC_5c": 10},
                {"name":"2010/2011","refYear":"2010","month":"05","year":"2010","OC_5a":1, "OC_5b": 16, "OC_5c": 7},
                {"name":"2010/2011","refYear":"2010","month":"06","year":"2010","OC_5a":0, "OC_5b": 13, "OC_5c": 25}
                ]
             }
            ]
    }
]

Thank you so much for your help!

Expanding on my comment:

The jsonlite package has a lot of features, but what you're describing no longer maps onto a data frame, so I doubt any canned routine does this. Your best bet is probably to convert the data frame to a more general list (FYI, data frames are stored internally as lists of columns) whose structure matches the JSON exactly, then use the converter to translate that list to JSON.
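As a quick check of the point that a data frame is really a list of columns, here is a small base R sketch. The data frame `d` is a made-up two-row example, not the asker's data.

```r
# A made-up two-row data frame, for illustration only
d <- data.frame(name  = c("2009/2010", "2010/2011"),
                OC_5a = c(2, 1),
                stringsAsFactors = FALSE)

is.list(d)          # a data frame is a list under the hood
names(d)            # the list elements are the columns

# Dropping the data.frame class leaves a plain list, one vector per column
cols <- as.list(d)
str(cols)
```

Because of this, a data frame can sit inside an ordinary nested list wherever the JSON needs a "scores" array.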

This is complicated in general but in your case should be fairly simple. The list will be structured exactly like the JSON data:

list(
  list(
    ccg = "Town1",
    data = list(
      list(
        period = "yearly",
        scores = yearly_data_frame_town1
      ),
      list(
        period = "monthly",
        scores = monthly_data_frame_town1
      )
    )
  ),
  list(
    ccg = "Town2",
    data = list(
      list(
        period = "yearly",
        scores = yearly_data_frame_town2
      ),
      list(
        period = "monthly",
        scores = monthly_data_frame_town2
      )
    )
  )
)

Constructing this list should be a straightforward case of looping over unique(DF$CCG) and calling aggregate at each step to build the yearly data.
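A minimal sketch of that loop, assuming jsonlite is installed and that the score columns are the three OC_* columns from the question. The data is re-entered by hand, and the names `score_cols` and `out` are just illustrative.

```r
library(jsonlite)

# The asker's data, re-entered by hand (Year, Month, refYear kept as strings)
DF <- data.frame(
  CCG     = "MyTown",
  Year    = "2010",
  Month   = sprintf("%02d", 1:6),
  refYear = rep(c("2009", "2010"), each = 3),
  name    = rep(c("2009/2010", "2010/2011"), each = 3),
  OC_5a   = c(0, 1, 1, 0, 1, 0),
  OC_5b   = c(15, 14, 6, 10, 16, 13),
  OC_5c   = c(27, 22, 10, 10, 7, 25),
  stringsAsFactors = FALSE
)

score_cols <- c("OC_5a", "OC_5b", "OC_5c")

# One list element per CCG, mirroring the target JSON structure
out <- lapply(unique(DF$CCG), function(ccg) {
  d <- DF[DF$CCG == ccg, ]
  # Yearly totals: sum the score columns within each (name, refYear) group
  yearly <- aggregate(d[score_cols], by = d[c("name", "refYear")], FUN = sum)
  monthly <- d[c("name", "refYear", "Month", "Year", score_cols)]
  rownames(monthly) <- NULL
  list(
    ccg  = ccg,
    data = list(
      list(period = "yearly",  scores = yearly),
      list(period = "monthly", scores = monthly)
    )
  )
})

# auto_unbox = TRUE writes length-1 vectors such as ccg as scalars, not arrays
json <- toJSON(out, auto_unbox = TRUE, pretty = TRUE)
cat(json)
```

jsonlite writes data frames row by row by default, so each "scores" entry should come out as an array of objects, one per row. The column names keep their R capitalisation (Month, Year), so rename them first if the lower-case keys matter.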

If you need performance, look to either the data.table or dplyr packages to do the looping and aggregating all at once. The former is flexible and performant but a little esoteric. The latter has relatively easy syntax and is similarly performant, but is designed specifically around building pipelines for data frames so it might take some hacking to get it to produce the right output format.
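For the data.table route, the aggregation across all CCGs can be written as a single grouped call. This is only a sketch on a small made-up table with two towns and two score columns; reshaping the result into the nested list is still a separate step.

```r
library(data.table)

# Small made-up table with two CCGs (not the asker's full data)
DT <- data.table(
  CCG     = rep(c("Town1", "Town2"), each = 4),
  name    = rep(c("2009/2010", "2010/2011"), times = 2, each = 2),
  refYear = rep(c("2009", "2010"), times = 2, each = 2),
  OC_5a   = c(0, 1, 1, 0, 2, 0, 1, 1),
  OC_5b   = c(15, 14, 6, 10, 3, 4, 5, 6)
)

# Sum every score column within each (CCG, name, refYear) group in one pass
yearly <- DT[, lapply(.SD, sum),
             by = .(CCG, name, refYear),
             .SDcols = c("OC_5a", "OC_5b")]
print(yearly)
```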

Looks like ssdecontrol has you covered... but here's my solution. You would need to loop over the unique CCG and Year values to build the entire data set...

df <- read.table(textConnection("Date_by_Month    CCG Year Month refYear      name OC_5a OC_5b OC_5c 
2010-01-01 MyTown 2010    01    2009 2009/2010     0    15    27 
2010-02-01 MyTown 2010    02    2009 2009/2010     1    14    22 
2010-03-01 MyTown 2010    03    2009 2009/2010     1     6    10 
2010-04-01 MyTown 2010    04    2010 2010/2011     0    10    10 
2010-05-01 MyTown 2010    05    2010 2010/2011     1    16     7 
2010-06-01 MyTown 2010    06    2010 2010/2011     0    13    25"), stringsAsFactors=FALSE, header=TRUE)


library(RJSONIO)

# Build the nested list for one CCG and one calendar year.
to_list <- function(ccg, year){
  # Keep only the rows for this CCG and year
  df_monthly <- subset(df, CCG==ccg & Year==year)
  score_cols <- c("OC_5a", "OC_5b", "OC_5c")
  # Yearly totals: sum the scores within each (name, refYear) group
  df_yearly <- aggregate(df_monthly[,score_cols], df_monthly[,c("name", "refYear")], sum)
  l <- list("ccg"=ccg,
            data=list(list("period" = "yearly",
                           "scores" = as.list(df_yearly)
                      ),
                      list("period" = "monthly",
                           "scores" = as.list(df_monthly[,c("name", "refYear", score_cols)])
                      )
            )
       )
  return(l)
}
toJSON(to_list("MyTown", "2010"), pretty=TRUE)

Which returns this:

{
    "ccg" : "MyTown",
    "data" : [
        {
            "period" : "yearly",
            "scores" : {
                "name" : [
                    "2009/2010",
                    "2010/2011"
                ],
                "refYear" : [
                    2009,
                    2010
                ],
                "OC_5a" : [
                    2,
                    1
                ],
                "OC_5b" : [
                    35,
                    39
                ],
                "OC_5c" : [
                    59,
                    42
                ]
            }
        },
        {
            "period" : "monthly",
            "scores" : {
                "name" : [
                    "2009/2010",
                    "2009/2010",
                    "2009/2010",
                    "2010/2011",
                    "2010/2011",
                    "2010/2011"
                ],
                "refYear" : [
                    2009,
                    2009,
                    2009,
                    2010,
                    2010,
                    2010
                ],
                "OC_5a" : [
                    0,
                    1,
                    1,
                    0,
                    1,
                    0
                ],
                "OC_5b" : [
                    15,
                    14,
                    6,
                    10,
                    16,
                    13
                ],
                "OC_5c" : [
                    27,
                    22,
                    10,
                    10,
                    7,
                    25
                ]
            }
        }
    ]
}
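Note that this output stores each column as one array, whereas the question asked for one object per row. A possible way to get there with RJSONIO is to turn each data frame into a list of per-row lists before calling toJSON. The helper name `rows_as_lists` is hypothetical, and the yearly totals are entered by hand to keep the sketch self-contained.

```r
library(RJSONIO)

# Hypothetical helper: turn a data frame into a list with one named list per row
rows_as_lists <- function(d) {
  lapply(seq_len(nrow(d)), function(i) as.list(d[i, , drop = FALSE]))
}

# Yearly totals from the question, entered by hand
df_yearly <- data.frame(
  name    = c("2009/2010", "2010/2011"),
  refYear = c(2009, 2010),
  OC_5a   = c(2, 1),
  OC_5b   = c(35, 39),
  OC_5c   = c(59, 42),
  stringsAsFactors = FALSE
)

# An unnamed list of named lists is written as a JSON array of objects
scores <- rows_as_lists(df_yearly)
cat(toJSON(list(period = "yearly", scores = scores), pretty = TRUE))
```

The same helper could be applied to the monthly subset inside to_list in place of as.list.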
