Add new row to data.frame within recursive function in R

I am trying to parse XML in R using the XML package by Duncan Temple Lang. The code I have is as follows:

library(XML)
retrieveStructureInfo <- function(node, tableData) {  
    tableD <- data.frame(path = NA, node = NA, value = NA)

    for (i in 1 : xmlSize(xmlAttrs(node))) {
      tableD <- rbind(tableD, c("path", "node", "value"))  
      tableData <<- rbind(tableData, tableD)    
    }

    #children is the no. of nodes within a node
    for (i in 1 : children) {
      #recursive function call
      retrieveStructureInfo(node[[i]], tableD) 
    }
}

#parse xml document
#xmlfile is the file path
doc <- xmlParse(xmlfile)
r <- xmlRoot(doc)
tableData <- data.frame(path = NA, node = NA, value = NA)
retrieveStructureInfo(r, tableData)
tableData

I am having issues adding rows to the data.frame because it is being done in a recursive function. For the XML given below, only the last two attribute values are added to the data.frame, i.e. Source="b" and Available="true". I created a main table called tableData and tried to update it with a local table called tableD inside the function, but it doesn't work.

<CATALOG>
   <PLANT>
      <COMMON Source="a" Available="false">Bloodroot</COMMON>
   </PLANT>
   <PLANT>
      <COMMON Source="b" Available="true">Columbine</COMMON>
   </PLANT>
</CATALOG>

I forgot to add that I am aiming to create a function that reads any XML (which is why I went with recursion) and produces output like this:

                path      node     value parent      type
CATALOG/PLANT/COMMON    Source         a  PLANT attribute
CATALOG/PLANT/COMMON Available     false  PLANT attribute
CATALOG/PLANT/COMMON    COMMON Bloodroot  PLANT      text

The usual answer to XML and recursion is to use XPath. You may be able to create a table with a few XPath queries or one of the helper functions like xmlToList.

x<- '<CATALOG>
   <PLANT>
      <COMMON Source="a" Available="false">Bloodroot</COMMON>
   </PLANT>
   <PLANT>
      <COMMON Source="b" Available="true">Columbine</COMMON>
   </PLANT>
</CATALOG>'

doc <- xmlParse(x)

xpathSApply(doc, "//COMMON", xmlValue)
[1] "Bloodroot" "Columbine"
xpathSApply(doc, "//COMMON", xmlGetAttr, "Source")
[1] "a" "b"
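
The two queries above can be combined into one data frame, one row per COMMON element. This is only a sketch: the column names and the `common` variable are chosen here for illustration, and `default = NA_character_` is passed to xmlGetAttr so that an element missing an attribute yields NA instead of NULL.

```r
library(XML)

x <- '<CATALOG>
   <PLANT>
      <COMMON Source="a" Available="false">Bloodroot</COMMON>
   </PLANT>
   <PLANT>
      <COMMON Source="b" Available="true">Columbine</COMMON>
   </PLANT>
</CATALOG>'

doc <- xmlParse(x, asText = TRUE)

# One XPath query per column; each returns one value per COMMON element.
common <- data.frame(
  text      = xpathSApply(doc, "//COMMON", xmlValue),
  Source    = xpathSApply(doc, "//COMMON", xmlGetAttr, "Source",
                          default = NA_character_),
  Available = xpathSApply(doc, "//COMMON", xmlGetAttr, "Available",
                          default = NA_character_),
  stringsAsFactors = FALSE
)
common
```

This works when the element and attribute names are known in advance. For arbitrary XML the queries would have to be generated from the document itself.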

y <- xmlToList(doc)
data.frame(path=names(unlist(y)),value=unlist( y) )
                           path     value
1             PLANT.COMMON.text Bloodroot
2    PLANT.COMMON..attrs.Source         a
3 PLANT.COMMON..attrs.Available     false
4             PLANT.COMMON.text Columbine
5    PLANT.COMMON..attrs.Source         b
6 PLANT.COMMON..attrs.Available      true

library(plyr)
ldply(y, data.frame)  #OR 
ldply( y, function(x) data.frame(x, names(x$COMMON$.attrs)  ) )
    .id COMMON.text COMMON..attrs names.x.COMMON..attrs.
1 PLANT   Bloodroot             a                 Source
2 PLANT   Bloodroot         false              Available
3 PLANT   Columbine             b                 Source
4 PLANT   Columbine          true              Available
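
If you still want recursion for arbitrary XML, one option is to drop the global `<<-` update and have each call return its own rows, which the caller then combines with rbind. Below is a minimal sketch of that idea, not a tested general-purpose parser. `collect_rows` and its `parent_path` argument are hypothetical names, and the column layout follows the output requested in the question.

```r
library(XML)

# Hypothetical helper: returns a data.frame with one row per attribute and
# one row per non-empty text value, for `node` and all of its descendants.
collect_rows <- function(node, parent_path = character(0)) {
  name   <- xmlName(node)
  path   <- paste(c(parent_path, name), collapse = "/")
  parent <- if (length(parent_path)) parent_path[length(parent_path)] else NA_character_

  # xmlAttrs() returns NULL when there are no attributes; as.character(NULL)
  # is character(0), so this yields a zero-row data.frame in that case.
  attrs <- xmlAttrs(node)
  n <- length(attrs)
  attr_rows <- data.frame(
    path = rep(path, n), node = as.character(names(attrs)),
    value = as.character(attrs), parent = rep(parent, n),
    type = rep("attribute", n), stringsAsFactors = FALSE
  )

  # Walk the children: gather direct text, recurse into element nodes.
  text <- character(0)
  child_rows <- list()
  for (child in xmlChildren(node)) {
    if (inherits(child, "XMLInternalTextNode")) {
      text <- c(text, xmlValue(child))
    } else if (inherits(child, "XMLInternalElementNode")) {
      child_rows[[length(child_rows) + 1]] <-
        collect_rows(child, c(parent_path, name))
    }
  }

  # Keep a text row only if something remains after trimming whitespace.
  text <- trimws(paste(text, collapse = ""))
  k <- if (nzchar(text)) 1 else 0
  text_row <- data.frame(
    path = rep(path, k), node = rep(name, k), value = rep(text, k),
    parent = rep(parent, k), type = rep("text", k), stringsAsFactors = FALSE
  )

  # Return the combined rows instead of modifying a global table.
  do.call(rbind, c(list(attr_rows, text_row), child_rows))
}

x <- '<CATALOG>
   <PLANT>
      <COMMON Source="a" Available="false">Bloodroot</COMMON>
   </PLANT>
   <PLANT>
      <COMMON Source="b" Available="true">Columbine</COMMON>
   </PLANT>
</CATALOG>'

result <- collect_rows(xmlRoot(xmlParse(x, asText = TRUE)))
result
```

Because every call returns a value, nothing depends on `<<-` or on which environment tableData lives in. Calling rbind at every level copies rows repeatedly, so for large documents it may be better to collect the pieces in a list and bind them once at the end.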
