
How to loop over JSON array in a JSON object

I've been trying to learn R, and I have a JSON file full of single-line JSON objects, each of which has an array of account data. What I'm trying to do is parse each row, get the JSON array out of the parsed JSON object, and pull out the account type and the amount. My problem is that I don't know how best to pull out just those two attributes.

I've tried using the dplyr package to pull "accountHistory" out of each of my JSON lines, but I get a console error. When I try:

select(JsonAcctData, "accountHistory.type", "accountHistory.amount")

What happens is that my code only returns the type and amount of the last account for each row.

Right now my code is writing to a CSV file and I can see all the data I need, but I just want to remove the ext

library("rjson")
library("dplyr")

parseJsonData <- function (sourceFile, outputFile) 
{
  #Get all total lines in the source file provided
  totalLines <- readLines(sourceFile)

  #Clean up old output file
  if(file.exists(outputFile)){
    file.remove(outputFile)
  }

  #Loop over each line in the sourceFile, 
  #parse the JSON and append to DataFrame
  JsonAcctData <- NULL
  for(i in 1:length(totalLines)){
    jsonValue <- fromJSON(totalLines[[i]])
    frame <- data.frame(jsonValue)
    JsonAcctData <- rbind(JsonAcctData, frame)
  }

  #Try to get filtered data
  filteredColumns <- 
    select(JsonAcctData, "accountHistory.type", "accountHistory.amount")
  print(filteredColumns)

  #Write the DataFrame to the output file in CSV format
  write.csv(JsonAcctData, file = outputFile)

}

Test JSON File Data:

{"name":"Test1", "accountHistory":[{"amount":"107.62","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyA","name":"Home Loan Account 
  6220","type":"payment","account":"11111111"}, 
  {"amount":"650.88","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyF","name":"Checking Account 
  9001","type":"payment","account":"123123123"}, 
  {"amount":"878.63","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyG","name":"Money Market Account 
  8743","type":"deposit","account":"123123123"}]}
  {"name":"Test2", "accountHistory":[{"amount":"199.29","date":"2012-02-            
  02T06:00:00.000Z","business":"CompanyB","name":"Savings Account 
  3580","type":"invoice","account":"12312312"}, 
  {"amount":"841.48","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Home Loan Account 
  5988","type":"payment","account":"123123123"}, 
  {"amount":"116.55","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Auto Loan Account 
  1794","type":"withdrawal","account":"12312313"}]}

What I would expect is to get a CSV that has just the account types and the amounts held in each account.

Here is a way using regex (in base R):

# read json 
json <- readLines('test.json', warn = FALSE)
# extract with regex
amount <- grep('\"amount\":\"\\d+\\.\\d+\"', json, value = TRUE)
amount <- as.numeric(gsub('.*amount\":\"(\\d+\\.+\\d+)\".*', '\\1', amount, perl = TRUE))
type   <- grep('\"type\":\"\\w+\"', json, value = TRUE)
type   <- gsub('.*type\":\"(\\w+)\".*', '\\1', type, perl = TRUE)
# output
data.frame(type, amount)
#         type amount
# 1    payment 107.62
# 2    payment 650.88
# 3    deposit 878.63
# 4    invoice 199.29
# 5    payment 841.48
# 6 withdrawal 116.55
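
One caveat with the regex approach: because the patterns are greedy, it picks up one amount and one type per line of the file, so it fits the wrapped sample above but would return only the last account of each object if every object really sits on a single line. As an alternative, here is a minimal sketch that parses each line with rjson (the package the question already loads) and loops over the accountHistory array. The helper name extractAccounts and the shortened sample lines are assumptions made for illustration, not part of the original code.

```r
library("rjson")

# Two single-line JSON objects, shortened from the test data in the question
# (only the "amount" and "type" fields are kept; this is illustrative input).
jsonLines <- c(
  '{"name":"Test1","accountHistory":[{"amount":"107.62","type":"payment"},{"amount":"650.88","type":"payment"},{"amount":"878.63","type":"deposit"}]}',
  '{"name":"Test2","accountHistory":[{"amount":"199.29","type":"invoice"},{"amount":"841.48","type":"payment"},{"amount":"116.55","type":"withdrawal"}]}'
)

# Hypothetical helper: parse one line and return one row per account entry.
extractAccounts <- function(line) {
  parsed <- fromJSON(line)
  history <- parsed[["accountHistory"]]
  data.frame(
    type = vapply(history, function(acct) acct[["type"]], character(1)),
    amount = as.numeric(
      vapply(history, function(acct) acct[["amount"]], character(1))
    ),
    stringsAsFactors = FALSE
  )
}

# Build one small data frame per line, then stack them.
accounts <- do.call(rbind, lapply(jsonLines, extractAccounts))
```

With a real file, jsonLines would come from readLines(sourceFile), and the result could be written out with write.csv(accounts, file = outputFile, row.names = FALSE). This assumes every object has an accountHistory array whose entries all carry type and amount as strings, as in the test data.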
