How do I load a json.rows file into R when it has multiple inconsistent nested fields?
I have 2 json.rows files with multiple levels of nested data. I tried the code below to convert them into a data frame, and it worked for the first file.
tl <- function(e) { if (is.null(e)) return(NULL); ret <- typeof(e); if (ret == 'list' && !is.null(names(e))) ret <- list(type='namedlist') else ret <- list(type=ret,len=length(e)); ret; };
mkcsv <- function(v) paste0(collapse=',',v);
keyListToStr <- function(keyList) paste0(collapse='','/',sapply(keyList,function(key) if (is.null(key)) '*' else paste0(collapse=',',key)));
extractLevelColumns <- function(
nodes, ## current level node selection
..., ## additional arguments to data.frame()
keyList=list(), ## current key path under main list
sep=NULL, ## optional string separator on which to join multi-element vectors; if NULL, will leave as separate columns
mkname=function(keyList,maxLen) paste0(collapse='.',if (is.null(sep) && maxLen == 1L) keyList[-length(keyList)] else keyList) ## name builder from current keyList and character vector max length across node level; default to dot-separated keys, and remove last index component for scalars
) {
cat(sprintf('extractLevelColumns(): %s\n',keyListToStr(keyList)));
if (length(nodes) == 0L) return(list()); ## handle corner case of empty main list
tlList <- lapply(nodes,tl);
typeList <- do.call(c,lapply(tlList,`[[`,'type'));
if (length(unique(typeList)) != 1L) stop(sprintf('error: inconsistent types (%s) at %s.',mkcsv(typeList),keyListToStr(keyList)));
type <- typeList[1L];
if (type == 'namedlist') { ## hash; recurse
allKeys <- unique(do.call(c,lapply(nodes,names)));
ret <- do.call(c,lapply(allKeys,function(key) extractLevelColumns(lapply(nodes,`[[`,key),...,keyList=c(keyList,key),sep=sep,mkname=mkname)));
} else if (type == 'list') { ## array; recurse
lenList <- do.call(c,lapply(tlList,`[[`,'len'));
maxLen <- max(lenList,na.rm=TRUE);
allIndexes <- seq_len(maxLen);
ret <- do.call(c,lapply(allIndexes,function(index) extractLevelColumns(lapply(nodes,function(node) if (length(node) < index) NULL else node[[index]]),...,keyList=c(keyList,index),sep=sep,mkname=mkname))); ## must be careful to translate out-of-bounds to NULL; happens automatically with string keys, but not with integer indexes
} else if (type%in%c('raw','logical','integer','double','complex','character')) { ## atomic leaf node; build column
lenList <- do.call(c,lapply(tlList,`[[`,'len'));
maxLen <- max(lenList,na.rm=TRUE);
if (is.null(sep)) {
ret <- lapply(seq_len(maxLen),function(i) setNames(data.frame(sapply(nodes,function(node) if (length(node) < i) NA else node[[i]]),...),mkname(c(keyList,i),maxLen)));
} else {
## keep original type if maxLen is 1, IOW don't stringify
ret <- list(setNames(data.frame(sapply(nodes,function(node) if (length(node) == 0L) NA else if (maxLen == 1L) node else paste(collapse=sep,node)),...),mkname(keyList,maxLen)));
}; ## end if
} else stop(sprintf('error: unsupported type %s at %s.',type,keyListToStr(keyList)));
if (is.null(ret)) ret <- list(); ## handle corner case of exclusively empty sublists
ret;
}; ## end extractLevelColumns()
## simple interface function
flattenList <- function(mainList,...) do.call(cbind,extractLevelColumns(mainList,...));
But when I tried the above function on my second file, I kept getting this error:
Error in extractLevelColumns(lapply(nodes, `[[`, key), ..., keyList = c(keyList, :
error: inconsistent types (character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,character,cha
Here are a few sample images of the rows in my JSON file. The columns are very inconsistent.
https://i.stack.imgur.com/ZKgKk.png https://i.stack.imgur.com/f3kNS.png
I know it's an old question, but I recently ran into a similar error while working with a nested list. The error occurs because the function doesn't support type inconsistencies between parallel nodes. So one or more of your nodes contain elements that are not character values: either NULL or a list.
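To see which rows are responsible, it can help to tabulate the types found under the key named in the error message. The following is only a minimal sketch and not part of the original function: `typesAtKey`, `records`, and the field names `id` and `tags` are hypothetical, and it assumes the file has already been parsed into a list of named lists.

```r
## Hypothetical sample records; the field names are made up for illustration.
records <- list(
  list(id = "a", tags = "x"),
  list(id = "b", tags = NULL),
  list(id = "c", tags = list("y", "z"))
)

## Report the type stored under `key` in each record.
## A missing or NULL value is reported as "NULL".
typesAtKey <- function(nodes, key) {
  vapply(nodes, function(node) {
    value <- node[[key]]
    if (is.null(value)) "NULL" else typeof(value)
  }, character(1))
}

## Count how many records hold each type under "tags".
print(table(typesAtKey(records, "tags")))
```

Any key whose table shows more than one type is a candidate for the "inconsistent types" error.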
If it's NULL, you can convert the NULL to "NA" and it should work fine. If it's a list, unfortunately I couldn't make it work without throwing away information. I removed the node of type list and it worked.
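The workaround described above could look roughly like this. It is a sketch under assumptions, not the exact code I ran: `cleanField`, `records`, and the field names are hypothetical, and the offending key is assumed to be known from the error message.

```r
## Hypothetical sample records; the field names are made up for illustration.
records <- list(
  list(id = "a", tags = "x"),
  list(id = "b", tags = NULL),
  list(id = "c", tags = list("y", "z"))
)

## For one key: turn NULL values into the string "NA" and drop list values.
cleanField <- function(records, key) {
  lapply(records, function(record) {
    value <- record[[key]]
    if (is.null(value)) {
      record[[key]] <- "NA"   # placeholder string, as suggested above
    } else if (is.list(value)) {
      record[[key]] <- NULL   # removes the element; its contents are lost
    }
    record
  })
}

cleaned <- cleanField(records, "tags")
str(cleaned)
```

Dropping the list-valued element is lossy, so it is worth checking how many rows are affected before relying on the result.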