How do I load a json.rows file with multiple inconsistent nested data into R?

Question

I have two json.rows files with multiple levels of nested data. I tried the code below to convert them into a data frame, and it worked for the first file.

tl <- function(e) { if (is.null(e)) return(NULL); ret <- typeof(e); if (ret == 'list' && !is.null(names(e))) ret <- list(type='namedlist') else ret <- list(type=ret,len=length(e)); ret; };
mkcsv <- function(v) paste0(collapse=',',v);
keyListToStr <- function(keyList) paste0(collapse='','/',sapply(keyList,function(key) if (is.null(key)) '*' else paste0(collapse=',',key)));

extractLevelColumns <- function(
  nodes, ## current level node selection
  ..., ## additional arguments to data.frame()
  keyList=list(), ## current key path under main list
  sep=NULL, ## optional string separator on which to join multi-element vectors; if NULL, will leave as separate columns
  mkname=function(keyList,maxLen) paste0(collapse='.',if (is.null(sep) && maxLen == 1L) keyList[-length(keyList)] else keyList) ## name builder from current keyList and character vector max length across node level; default to dot-separated keys, and remove last index component for scalars
) {
  cat(sprintf('extractLevelColumns(): %s\n',keyListToStr(keyList)));
  if (length(nodes) == 0L) return(list()); ## handle corner case of empty main list
  tlList <- lapply(nodes,tl);
  typeList <- do.call(c,lapply(tlList,`[[`,'type'));
  if (length(unique(typeList)) != 1L) stop(sprintf('error: inconsistent types (%s) at %s.',mkcsv(typeList),keyListToStr(keyList)));
  type <- typeList[1L];
  if (type == 'namedlist') { ## hash; recurse
    allKeys <- unique(do.call(c,lapply(nodes,names)));
    ret <- do.call(c,lapply(allKeys,function(key) extractLevelColumns(lapply(nodes,`[[`,key),...,keyList=c(keyList,key),sep=sep,mkname=mkname)));
  } else if (type == 'list') { ## array; recurse
    lenList <- do.call(c,lapply(tlList,`[[`,'len'));
    maxLen <- max(lenList,na.rm=TRUE);
    allIndexes <- seq_len(maxLen);
    ret <- do.call(c,lapply(allIndexes,function(index) extractLevelColumns(lapply(nodes,function(node) if (length(node) < index) NULL else node[[index]]),...,keyList=c(keyList,index),sep=sep,mkname=mkname))); ## must be careful to translate out-of-bounds to NULL; happens automatically with string keys, but not with integer indexes
  } else if (type%in%c('raw','logical','integer','double','complex','character')) { ## atomic leaf node; build column
    lenList <- do.call(c,lapply(tlList,`[[`,'len'));
    maxLen <- max(lenList,na.rm=TRUE);
    if (is.null(sep)) {
      ret <- lapply(seq_len(maxLen),function(i) setNames(data.frame(sapply(nodes,function(node) if (length(node) < i) NA else node[[i]]),...),mkname(c(keyList,i),maxLen)));
    } else {
      ## keep original type if maxLen is 1, IOW don't stringify
      ret <- list(setNames(data.frame(sapply(nodes,function(node) if (length(node) == 0L) NA else if (maxLen == 1L) node else paste(collapse=sep,node)),...),mkname(keyList,maxLen)));
    }; ## end if
  } else stop(sprintf('error: unsupported type %s at %s.',type,keyListToStr(keyList)));
  if (is.null(ret)) ret <- list(); ## handle corner case of exclusively empty sublists
  ret;
}; ## end extractLevelColumns()
## simple interface function
flattenList <- function(mainList,...) do.call(cbind,extractLevelColumns(mainList,...));

But when I tried the function above on my second file, I kept getting the following error:

 Error in extractLevelColumns(lapply(nodes, `[[`, key), ..., keyList = c(keyList,  : 
  error: inconsistent types (character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,cha

Here are a few sample images of the rows in my JSON file. The columns are very inconsistent.

https://i.stack.imgur.com/ZKgKk.png https://i.stack.imgur.com/f3kNS.png

Answer 1

I know this is an old question, but I recently hit a similar error while working with a nested list. The error occurs because the function does not support type inconsistencies between parallel nodes, that is, the values found at the same key path in different records. So at least one of your nodes holds a non-character element at that path: either NULL or a list.
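To find out which records are responsible, it can help to list the type found under a given key in every record. The following is only a sketch: `records`, `nodeType`, and `typesAtKey` are hypothetical names that do not come from the question, and the sample data merely stands in for the parsed file.

```r
# Hypothetical records standing in for the parsed json.rows file.
records <- list(
  list(id = "a", tags = "x"),
  list(id = "b", tags = NULL),
  list(id = "c", tags = list("y", "z"))
)

# Label a node as "NULL" or by its typeof() (lists report "list").
nodeType <- function(e) if (is.null(e)) "NULL" else typeof(e)

# Return the type found under one key, one entry per record.
typesAtKey <- function(records, key) {
  vapply(records, function(r) nodeType(r[[key]]), character(1))
}

types <- typesAtKey(records, "tags")
print(table(types))            # how many records hold each type
print(which(types == "list"))  # positions of the list-typed records
```

Running this for the key named at the end of the error message should show which records deviate from the majority type.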

If it is NULL, you can convert the NULL to "NA" and it should work fine. If it is a list, unfortunately I could not make it work without throwing away information: I removed the node of type list, and then it worked.
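A minimal sketch of both workarounds, under stated assumptions: `nullToNA` and `dropListAtKey` are hypothetical helpers, the sample `records` stands in for the parsed file, and `NA_character_` is used here instead of the literal string "NA" so that the missing values stay recognizable as missing.

```r
# Hypothetical records standing in for the parsed json.rows file.
records <- list(
  list(id = "a", tags = "x"),
  list(id = "b", tags = NULL),
  list(id = "c", tags = list("y", "z"))
)

# Recursively replace every NULL with a character NA.
nullToNA <- function(x) {
  if (is.null(x)) return(NA_character_)
  if (is.list(x)) return(lapply(x, nullToNA))
  x
}

# Remove the element under `key` from each record where it is a list.
# This discards the nested values, as the answer warns.
dropListAtKey <- function(records, key) {
  lapply(records, function(r) {
    if (is.list(r[[key]])) r[[key]] <- NULL
    r
  })
}

cleaned <- dropListAtKey(nullToNA(records), "tags")
str(cleaned)
```

The cleaned list could then be passed to the `flattenList` function from the question. Whether that is sufficient depends on the data, since other keys may need the same treatment.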
