
How do you convert an XML file into a data frame in R?

I am trying to parse this XML and put it into a data frame.

The file content looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<dashboardreport name="Incident_Rules" version="7.2.5.1022" reportdate="2019-02-20T14:45:57.352-05:00" description="">
  <source name="app1">
    <filters summary="last 30 minutes (auto)">
      <filter>tf:DiagnoseTimeframe?1550690157352:1550691957352</filter>
    </filters>
  </source>
  <reportheader>
    <reportdetails>
      <user>user1</user>
    </reportdetails>
  </reportheader>
  <data>
    <incidentchartdashlet name="Incident Chart" description="">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1" />
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1" />
        <incidentchartrecord rule="JVM Heap Utilization > 90%" systemprofile="app1" />
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>

The data frame needs to look like this:

Source Name      Rule
App1         Database Exception
App1         Response time greater than 30 minutes
App1         JVM Heap Utilization > 90%

I need to extract the "name" attribute of source and the "rule" attribute of each incidentchartrecord. I have tried something like this:

library("XML")
doc <- read_xml(file)
  dat<-xml_find_all(doc, ".//incidentchartrecord") %>%
    map_df(function(x) {
      xml_find_all(x, ".//incidentchartrecord") %>%
        map_df(~as.list(xml_attrs(.))) %>%
        select(rule) %>%
        mutate(node=xml_attr(x, "incidentchartrecord"))
    })

Any ideas?

Here's an approach that works. I used xml2 instead of XML; that's where the xml_find_all and xml_attr functions are found.

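library(xml2)
# Parse the report file
doc <- read_xml("test.xml")
# "name" attribute of the <source> node
source <- xml_attr(xml_find_all(doc, ".//source"), "name")
# "rule" attribute of every <incidentchartrecord> node
rules <- xml_attr(xml_find_all(doc, ".//incidentchartrecord"), "rule")
# The single source name is recycled across all rule rows
df <- data.frame("Source.Name" = source, Rule = rules, stringsAsFactors = FALSE)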
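If you want to try this without a test.xml on disk, here is a minimal self-contained sketch of the same idea. It assumes the report has exactly one source node, and report_xml is just a hypothetical inline copy of the question's file, trimmed to the nodes that matter. read_xml accepts literal XML as a string, so no file is needed.

```r
library(xml2)

# Hypothetical inline copy of the report, trimmed to the relevant nodes
report_xml <- '<dashboardreport name="Incident_Rules">
  <source name="app1"/>
  <data>
    <incidentchartdashlet name="Incident Chart">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1"/>
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1"/>
        <incidentchartrecord rule="JVM Heap Utilization > 90%" systemprofile="app1"/>
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>'

# read_xml treats a string containing markup as literal XML
doc <- read_xml(report_xml)

# All record nodes, wherever they sit in the tree
records <- xml_find_all(doc, ".//incidentchartrecord")

# Assumption: one <source> per report, so xml_find_first returns a single
# node and its name is recycled across every rule row
df <- data.frame(
  Source.Name = xml_attr(xml_find_first(doc, ".//source"), "name"),
  Rule = xml_attr(records, "rule"),
  stringsAsFactors = FALSE
)

print(df)
```

If a report can contain several source nodes, this recycling would no longer be safe, and you would need some way to tie each record back to its source.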
