
How to convert data frame to JSON array using R

In my current project, I am reading data from a CSV file and trying to build a hierarchical JSON array from that data in R. Sample data is shown below.

Sample data (reduced for simplicity):

    Country   Provider   2 G Data   3 G Data    LTE     FP0   anfang0   2G  3G  FP1 anfang1
     ABC        A1          n          n         n      fp0      j      NA  NA  NA  NA
     ABC        A2          NA         NA        NA      NA      NA      j   j  fp1 n
     ABC        A3          n          n         n       fp0     j      NA  NA  NA  NA
     DEF        A7          j          j         j       fp0     n       j   j  fp1 n

Understanding the data: n means no, j means yes, and NA means the value is missing. FP0 and FP1 describe the same provider in different areas. Each row holds two groups of data: 2 G Data, 3 G Data, LTE, FP0 and anfang0 belong to one group, while 2G, 3G, FP1 and anfang1 belong to the other. If all values in a group are n (no), the corresponding anfang0 or anfang1 value has to be used instead.

The sample output is shown below (based on the above explanation):
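The rule described above can be sketched for a single provider and a single group as follows. This is only a minimal illustration in base R: the helper name `build_entry` and its arguments are hypothetical, and the fallback key is called `anfang` after the column name (the expected output in the question spells it `anfrage`).

```r
# Hypothetical helper: build one provider entry for one area (group) of a row.
# `flags` is a named character vector such as c("2G" = "n", "3G" = "j").
build_entry <- function(provider, flags, anfang) {
  yes <- flags[!is.na(flags) & flags == "j"]     # keep only the "yes" flags
  if (length(yes) > 0) {
    c(list(provider = provider), as.list(yes))   # report the available services
  } else {
    list(provider = provider, anfang = anfang)   # nothing is "j": fall back to anfang
  }
}

# Row A1, group FP0: every flag is "n", so the anfang0 value is reported.
entry_a1 <- build_entry("A1", c("2G" = "n", "3G" = "n", "LTE" = "n"), "j")

# Row A7, group FP0: all flags are "j", so the flags themselves are reported.
entry_a7 <- build_entry("A7", c("2G" = "j", "3G" = "j", "LTE" = "j"), "n")
```

Each result is a named list, so it can be passed to `jsonlite::toJSON(..., auto_unbox = TRUE)` to produce one object of the `fp0` or `fp1` array.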

    {
      "ABC": {
        "fp0":[
          {
            "provider": "A1",
            "anfrage": "j"
          },
          {
            "provider": "A3",
            "anfrage": "j"
          }
        ],
        "fp1": [
          {
            "provider": "A2",
            "2G": "j",
            "3G": "j"
          }
        ]
      },
      "DEF": {
        "fp1": [
          {
            "provider": "A7",
            "2G": "j",
            "3G": "j"
          }  
        ],
        "fp0": [
          {
            "provider": "A7",
            "2G": "j",
            "3G": "j",
            "LTE": "j"
          }  
        ]       
      }  
    }

In the JSON format above, each Country should have only a single JSON block. So far I have tried to follow this link, but couldn't find a working solution.

a <- c()  # accumulator for the per-row JSON strings
for (i in 1:nrow(data)) {
  a <- c(a, jsonlite::toJSON(list(list("fp0" = list(
    "provider" = data$Provider[i], "2g" = data$`2 G Data`[i],
    "3g" = data$`3 G Data`[i], "LTE" = data$LTE[i]))), pretty = TRUE))
}
jsonlite::toJSON(a, pretty = TRUE, auto_unbox = TRUE)

Kindly let me know in case you need more clarity.

One approach could be:

library(dplyr)
library(jsonlite)

#data pre-processing (bind different areas' data in row)
df1 <- df[, 1:7] %>%                          #dataframe having data for one area - i.e. fp0
  na.omit() %>%
  `colnames<-`(c("country", "provider", "2G", "3G", "LTE", "fp", "anfang")) %>%
  bind_rows(
    df[, c(1:2, 8:ncol(df))] %>%              #dataframe having data for another area - i.e. fp1
      na.omit() %>%
      `colnames<-`(c("country", "provider", "2G", "3G", "fp", "anfang"))
    )
df1[df1 == 'n'] <- NA                         #convert all "n" to NA as we are not concerned about them in the final output

#convert processed dataframe to a list
dfList <- lapply(split(df1, df1$country), 
                 function(x) split(x[, c("provider", "2G", "3G", "LTE", "anfang")], x$fp))

#final result (convert list to JSON)
json_out <- toJSON(dfList, auto_unbox = TRUE)

which gives

> json_out
{"ABC":{"fp0":[{"provider":"A1","anfang":"j"},{"provider":"A3","anfang":"j"}],"fp1":[{"provider":"A2","2G":"j","3G":"j"}]},"DEF":{"fp0":[{"provider":"A7","2G":"j","3G":"j","LTE":"j"}],"fp1":[{"provider":"A7","2G":"j","3G":"j"}]}}
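
To check that a result like this has the intended shape, it can be parsed back into R. The sketch below is an addition for illustration, not part of the original answer: it assumes the jsonlite package is installed and pastes the JSON string from the output above so that it runs on its own.

```r
library(jsonlite)

# JSON string copied from the output above, so this snippet is self-contained.
json_out <- '{"ABC":{"fp0":[{"provider":"A1","anfang":"j"},{"provider":"A3","anfang":"j"}],"fp1":[{"provider":"A2","2G":"j","3G":"j"}]},"DEF":{"fp0":[{"provider":"A7","2G":"j","3G":"j","LTE":"j"}],"fp1":[{"provider":"A7","2G":"j","3G":"j"}]}}'

# Parse it back: a list with one element per country, each holding one
# data frame per area (fp0, fp1).
parsed <- fromJSON(json_out)

countries     <- names(parsed)            # top-level keys, one per country
abc_fp0_provs <- parsed$ABC$fp0$provider  # providers listed under ABC / fp0

# Human-readable version of the same JSON.
pretty_json <- prettify(json_out)
```

Printing `pretty_json` (or calling `toJSON(dfList, auto_unbox = TRUE, pretty = TRUE)` in the answer's code) gives an indented layout similar to the expected output in the question.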


Sample data:

df <- structure(list(Country = c("ABC", "ABC", "ABC", "DEF"), Provider = c("A1", 
"A2", "A3", "A7"), `2 G Data` = c("n", NA, "n", "j"), `3 G Data` = c("n", 
NA, "n", "j"), LTE = c("n", NA, "n", "j"), FP0 = c("fp0", NA, 
"fp0", "fp0"), anfang0 = c("j", NA, "j", "n"), `2G` = c(NA, "j", 
NA, "j"), `3G` = c(NA, "j", NA, "j"), FP1 = c(NA, "fp1", NA, 
"fp1"), anfang1 = c(NA, "n", NA, "n")), .Names = c("Country", 
"Provider", "2 G Data", "3 G Data", "LTE", "FP0", "anfang0", 
"2G", "3G", "FP1", "anfang1"), class = "data.frame", row.names = c(NA, 
-4L))

#  Country Provider 2 G Data 3 G Data  LTE  FP0 anfang0   2G   3G  FP1 anfang1
#1     ABC       A1        n        n    n  fp0       j <NA> <NA> <NA>    <NA>
#2     ABC       A2     <NA>     <NA> <NA> <NA>    <NA>    j    j  fp1       n
#3     ABC       A3        n        n    n  fp0       j <NA> <NA> <NA>    <NA>
#4     DEF       A7        j        j    j  fp0       n    j    j  fp1       n
