xml document into data.frame in R

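I have an XML document that I need to load into a data.frame in R. So far I have managed this for a simple XML file, using xmlToList from the XML package and ldply from the plyr package: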

dataframe=ldply(xmlToList("file.xml"), data.frame)

But when I run it on this XML:

    <BusinessUnitList>
    <BusinessUnit id="000000195">
      <User id="897654322" firstName="Rick" lastName="Test" middleName="R" defaultLanguageName="English">
        <RoleList>
          <Role id="worker"/>
        </RoleList>
        <OrgList>
          <Organization id="1111"/>
        </OrgList>
        <Address country="Italy"/>
        <Employee badgeNumber="575757" Date="2017-01-01" DateNew="2017-01-02" birthDate="1999-01-01">
          <Availability val1="5" val2="n" val3="6" HoursPerWeek="33.75" HoursBetweenShifts="10" minHoursPerWeek="00.00"/>
        </Employee>
      </User>
    </BusinessUnit>
    <BusinessUnit id="000000111">
      <User id="897652222" firstName="TERI" lastName="tst2" middleName="D" defaultLanguageName="English">
        <RoleList>
          <Role id="worker"/>
        </RoleList>
        <OrgList>
          <Organization id="2222"/>
        </OrgList>
        <Address country="Portugal"/>
        <Employee badgeNumber="575757" Date="2017-02-02" DateNew="2017-02-02" birthDate="1998-01-01">
          <Availability val1="5" val2="n" val3="6" HoursPerWeek="33.75" HoursBetweenShifts="10" minHoursPerWeek="00.00"/>
        </Employee>
      </User>
    </BusinessUnit>
    </BusinessUnitList>

I get an error: Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 9, 7.

You are trying to combine list elements of different lengths into a data.frame, like this:

list(a=1:2, b=3:5)
$a
[1] 1 2

$b
[1] 3 4 5

data.frame( list(a=1:2, b=3:5))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 2, 3
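
To make the cause concrete, here is a minimal base R sketch of one way to get past the error for the toy list above: pad the shorter elements with NA so every column has the same length. The padding step is only an illustration and is not part of the original answer; whether NA padding makes sense depends on the data.

```r
# Two list elements of different lengths, as in the example above
lst <- list(a = 1:2, b = 3:5)

# data.frame() needs every column to have the same number of rows,
# so pad the shorter elements with NA up to the longest length
n <- max(lengths(lst))
padded <- lapply(lst, function(v) c(v, rep(NA, n - length(v))))

df <- data.frame(padded)
```

For the nested XML list, padding would line up unrelated values, so flattening the list is probably the more useful route.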

I would probably unlist the xmlToList result and then clean up the column names.

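library(XML)  # provides xmlParse() and xmlToList()

doc <- xmlParse("file.xml")
# Flatten the nested list into one named vector, then into a one-row data.frame
x <- data.frame(t(unlist(xmlToList(doc))))
# Turn names ending in "Role.id" or "User..attrs.id" into "Role_id" or "User_id"
names(x) <- gsub("(..attrs)?.id$", "_id", names(x))
# Drop everything up to the last dot, keeping only the final name component
names(x) <- gsub(".*\\.", "", names(x))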

  Role_id Organization_id country val1 val2 val3 HoursPerWeek HoursBetweenShifts minHoursPerWeek badgeNumber       Date    DateNew  birthDate   User_id firstName lastName middleName defaultLanguageName BusinessUnit_id
1  worker            1111   Italy    5    n    6        33.75                 10           00.00      575757 2017-01-01 2017-01-02 1999-01-01 897654322      Rick     Test          R             English       000000195
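
The output above shows a single row. If the file holds several BusinessUnit elements, unlisting the whole document puts every unit's values into that one row. A possible variation, sketched below, applies the same unlist idea to each BusinessUnit node separately and stacks the results. It uses a trimmed inline copy of the XML so that it is self-contained, and it assumes every BusinessUnit carries the same attributes in the same order; the names xml_text, flatten_unit, units and rows are made up for this illustration.

```r
library(XML)

# Trimmed copy of the question's XML (assumption: real files follow this shape)
xml_text <- '<BusinessUnitList>
  <BusinessUnit id="000000195">
    <User id="897654322" firstName="Rick" lastName="Test">
      <Address country="Italy"/>
    </User>
  </BusinessUnit>
  <BusinessUnit id="000000111">
    <User id="897652222" firstName="TERI" lastName="tst2">
      <Address country="Portugal"/>
    </User>
  </BusinessUnit>
</BusinessUnitList>'

doc <- xmlParse(xml_text, asText = TRUE)

# Hypothetical helper: flatten one BusinessUnit node into a one-row data.frame
flatten_unit <- function(node) {
  v <- unlist(xmlToList(node))
  # Prefix with the node name so the unit's own id ends up as "BusinessUnit_id"
  names(v) <- paste(xmlName(node), names(v), sep = ".")
  # Same name clean-up as in the answer, applied before building the data.frame
  names(v) <- gsub("(..attrs)?.id$", "_id", names(v))
  names(v) <- gsub(".*\\.", "", names(v))
  as.data.frame(t(v), stringsAsFactors = FALSE)
}

units <- getNodeSet(doc, "//BusinessUnit")
rows <- lapply(units, flatten_unit)

# Stack the one-row data.frames; this requires identical column names per unit
x <- do.call(rbind, rows)
```

With this layout x should have one row per BusinessUnit. All columns come back as character, so numeric or date fields would still need converting afterwards.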
