Using xpathSApply in R to calculate mean

I'm having trouble using xpathSApply to calculate the mean temperature. The XML is available at http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/forecast_hour_by_hour.xml

My R code:

library(XML)
fileURL<-"http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/forecast_hour_by_hour.xml"
doc <- xmlTreeParse(fileURL, useInternal=TRUE)
rootNode <- xmlRoot(doc)
xmlName(rootNode)
mean(xpathSApply(rootNode, "//temperature", xmlValue))

The XML looks like this:

<weatherdata>
<location>
<name>Kuala Lumpur</name>
<type>Capital</type>
<country>Malaysia</country>
<timezone id="Asia/Kuala_Lumpur" utcoffsetMinutes="480"/>
<location altitude="56" latitude="3.1412" longitude="101.68653" geobase="geonames" geobaseid="1735161"/>
</location>
<credit>
<!--
In order to use the free weather data from yr no, you HAVE to display 
the following text clearly visible on your web page. The text should be a 
link to the specified URL.
-->
<!--
Please read more about our conditions and guidelines at http://om.yr.no/verdata/  English explanation at http://om.yr.no/verdata/free-weather-data/
-->
<link text="Weather forecast from yr.no, delivered by the Norwegian Meteorological Institute and the NRK" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/"/>
</credit>
<links>
<link id="xmlSource" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/forecast.xml"/>
<link id="xmlSourceHourByHour" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/forecast_hour_by_hour.xml"/>
<link id="overview" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/"/>
<link id="hourByHour" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/hour_by_hour"/>
<link id="longTermForecast" url="http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/long"/>
</links>
<meta>
<lastupdate>2015-06-26T15:40:08</lastupdate>
<nextupdate>2015-06-27T04:00:00</nextupdate>
</meta>
<sun rise="2015-06-26T07:06:55" set="2015-06-26T19:25:04"/>
<forecast>
<tabular>
<time from="2015-06-26T17:00:00" to="2015-06-26T20:00:00">
<!--
Valid from 2015-06-26T17:00:00 to 2015-06-26T20:00:00 
-->
<symbol number="1" numberEx="1" name="Clear sky" var="01d"/>
<precipitation value="0"/>
<!--  Valid at 2015-06-26T17:00:00  -->
<windDirection deg="163.0" code="SSE" name="South-southeast"/>
<windSpeed mps="2.9" name="Light breeze"/>
<temperature unit="celsius" value="31"/>
<pressure unit="hPa" value="1008.1"/>
</time>
<time from="2015-06-26T20:00:00" to="2015-06-26T23:00:00">
<!--
Valid from 2015-06-26T20:00:00 to 2015-06-26T23:00:00 
-->
<symbol number="1" numberEx="1" name="Clear sky" var="mf/01n.31"/>
<precipitation value="0"/>
<!--  Valid at 2015-06-26T20:00:00  -->
<windDirection deg="143.3" code="SE" name="Southeast"/>
<windSpeed mps="1.2" name="Light air"/>
<temperature unit="celsius" value="29"/>
<pressure unit="hPa" value="1009.4"/>
</time>
</tabular>
</forecast>
</weatherdata>

Am I doing this the right way, or have I got something wrong? Apologies if this is a duplicate question.

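You have two or three problems:

  1. The function xpathSApply expects the XML document as its first argument. Use xpathSApply(doc, ...) instead of xpathSApply(rootNode, ...).

  2. The temperature values are stored in an attribute of the element, not in its text content, so xmlValue has no number to return. You can get the attribute with an XPath expression (element/@attribute):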

     temp <- xpathSApply(doc, "//temperature/@value", as.numeric) 

    or with the xmlGetAttr function:

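     temp <- as.numeric(xpathSApply(doc, "//temperature", xmlGetAttr, "value"))

  3. Note the as.numeric calls in both approaches. The mean function needs a numeric vector, and attribute values are read from the XML as character strings.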
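
The points above can be checked without downloading anything. The following is a minimal sketch, assuming the XML package is installed; it parses a two-period excerpt built from the XML in the question, and the names xml_text, temp_xpath and temp_attr are made up for this illustration.

```r
library(XML)

# Two forecast periods taken from the XML excerpt in the question
xml_text <- '<weatherdata>
  <forecast>
    <tabular>
      <time from="2015-06-26T17:00:00" to="2015-06-26T20:00:00">
        <temperature unit="celsius" value="31"/>
      </time>
      <time from="2015-06-26T20:00:00" to="2015-06-26T23:00:00">
        <temperature unit="celsius" value="29"/>
      </time>
    </tabular>
  </forecast>
</weatherdata>'

# asText = TRUE marks the argument as XML content rather than a file name or URL
doc <- xmlParse(xml_text, asText = TRUE)

# Option 1: select the attribute in the XPath expression and convert each match
temp_xpath <- xpathSApply(doc, "//temperature/@value", as.numeric)

# Option 2: select the elements, read the attribute with xmlGetAttr, then convert
temp_attr <- as.numeric(xpathSApply(doc, "//temperature", xmlGetAttr, "value"))

# Both vectors are numeric, so mean() can average them
mean(temp_xpath)
mean(temp_attr)
```

With only the two values 31 and 29 in the excerpt, both calls should average to 30. The live feed contains more periods, so its mean will differ.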

It works like this:

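library(XML)
fileURL <- "http://www.yr.no/place/Malaysia/Kuala_Lumpur/Kuala_Lumpur/forecast_hour_by_hour.xml"
# Parse into an internal document so that XPath queries can be run against it
doc <- xmlTreeParse(fileURL, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)
xmlName(rootNode)
# Read the value attribute of every <temperature> element as a number, then average
mean(xpathSApply(doc, "//temperature/@value", as.numeric))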

The result is:

[1] "weatherdata"
[1] 27.6875
