XML document into data.frame in R
I have a problem: I have an XML document and I need to get it into a data.frame in R. So far I have managed to load a simple XML file into a data.frame using the packages XML and plyr and doing
library(XML)
library(plyr)
dataframe <- ldply(xmlToList("file.xml"), data.frame)
but when I run it on this XML:
<BusinessUnitList>
<BusinessUnit id="000000195">
<User id="897654322" firstName="Rick" lastName="Test" middleName="R" defaultLanguageName="English">
<RoleList>
<Role id="worker"/>
</RoleList>
<OrgList>
<Organization id="1111"/>
</OrgList>
<Address country="Italy"/>
<Employee badgeNumber="575757" Date="2017-01-01" DateNew="2017-01-02" birthDate="1999-01-01">
<Availability val1="5" val2="n" val3="6" HoursPerWeek="33.75" HoursBetweenShifts="10" minHoursPerWeek="00.00"/>
</Employee>
</User>
</BusinessUnit>
<BusinessUnit id="000000111">
<User id="897652222" firstName="TERI" lastName="tst2" middleName="D" defaultLanguageName="English">
<RoleList>
<Role id="worker"/>
</RoleList>
<OrgList>
<Organization id="2222"/>
</OrgList>
<Address country="Portugal"/>
<Employee badgeNumber="575757" Date="2017-02-02" DateNew="2017-02-02" birthDate="1998-01-01">
<Availability val1="5" val2="n" val3="6" HoursPerWeek="33.75" HoursBetweenShifts="10" minHoursPerWeek="00.00"/>
</Employee>
</User>
</BusinessUnit>
</BusinessUnitList>
I receive an error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 9, 7.
You are trying to combine a list like this, whose elements have different lengths, into a data.frame:
list(a=1:2, b=3:5)
$a
[1] 1 2
$b
[1] 3 4 5
data.frame( list(a=1:2, b=3:5))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 2, 3
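As a rough illustration of the point above, the sketch below uses only base R. The names `ragged`, `flat` and `one_row` are made up for this example. It checks the element lengths with `lengths()` and then shows that flattening the list with `unlist()` avoids the mismatch, because every value ends up in its own column of a single row.

```r
# A list whose elements differ in length cannot become columns of one data.frame
ragged <- list(a = 1:2, b = 3:5)

# lengths() reports the size of each element; these are the numbers in the error
lengths(ragged)

# Capture the error message instead of stopping the script
result <- tryCatch(data.frame(ragged), error = function(e) conditionMessage(e))

# Flattening gives one named vector (a1, a2, b1, b2, b3) ...
flat <- unlist(ragged)

# ... and transposing it yields a one-row data.frame with one column per value
one_row <- data.frame(t(flat))
```

This only sidesteps the error; whether one wide row is the right shape depends on the data.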
I would maybe unlist the xmlToList results and format the column names.
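library(XML)
doc <- xmlParse("file.xml")
# Flatten the nested list into one named vector, then into a one-row data.frame
x <- data.frame( t( unlist(xmlToList(doc))) )
# Tidy the names: a trailing ".id" (with optional "..attrs") becomes "_id", then the path prefix is dropped
names(x) <- gsub("(..attrs)?.id$", "_id", names(x))
names(x) <- gsub(".*\\.", "", names(x))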
Role_id Organization_id country val1 val2 val3 HoursPerWeek HoursBetweenShifts minHoursPerWeek badgeNumber Date DateNew birthDate User_id firstName lastName middleName defaultLanguageName BusinessUnit_id
1 worker 1111 Italy 5 n 6 33.75 10 00.00 575757 2017-01-01 2017-01-02 1999-01-01 897654322 Rick Test R English 000000195
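If one row per BusinessUnit is wanted rather than a single wide row, one possible variation is to flatten each top-level element separately and bind the results. The sketch below is not from the original answer. To stay self-contained it uses a small hand-built list (`units`, hypothetical) as a stand-in for part of what `xmlToList()` might return, assuming attributes are stored under `.attrs`. It also assumes every unit flattens to the same fields in the same order.

```r
# Hand-built stand-in for a fragment of the xmlToList() result (assumption)
units <- list(
  BusinessUnit = list(
    User = list(
      Address = c(country = "Italy"),
      .attrs  = c(id = "897654322", firstName = "Rick")
    ),
    .attrs = c(id = "000000195")
  ),
  BusinessUnit = list(
    User = list(
      Address = c(country = "Portugal"),
      .attrs  = c(id = "897652222", firstName = "TERI")
    ),
    .attrs = c(id = "000000111")
  )
)

# Flatten each unit on its own; re-wrapping keeps the "BusinessUnit." name prefix
rows <- lapply(units, function(u) unlist(list(BusinessUnit = u)))

# Stack the named vectors into a character matrix, one row per unit
m <- do.call(rbind, unname(rows))
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Same name clean-up as in the answer above
names(df) <- gsub("(..attrs)?.id$", "_id", names(df))
names(df) <- gsub(".*\\.", "", names(df))
```

With real data, `units` would be replaced by `xmlToList(doc)`. If some units have extra or missing elements, the vectors will not line up and `rbind` is no longer safe, so the field sets should be checked first.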