
Parsing XML data from the BGG XML API with R

This question is a follow-up to an earlier one: How to parse xml lists and tables in R for BGG API

I want to build a data frame from this table:

<marketplacelistings>
  <listing>
    <listdate>Thu, 19 Jan 2006 22:08:15 +0000</listdate>
    <price currency="EUR">90.00</price>
    <condition>likenew</condition>
    <notes>Siedler von Catan / Settlers of Catan-Set (Basisspiel/basic game + Erweiterungen Die Seefahrer/ Städte und Ritter/ 5-6 Spieler / extensions The Seafarers/ Cities and Knights/ 5-6 players); 3 x gespielt (Neuwertig; lediglich alle Bestandteile in EINER der Originalboxen verstaut) / 3 times played (like new; only all items in ONE original box stored); Abgabe nur komplett / selling only all together; KEIN Festpreis (nur um überhaupt etwas einzugeben) – erwarte Angebot! / no fixed price (just to complete the entries)– make an offer; Versand weltweit zu Lasten Käufer / shipping worldwide, paid by buyer</notes>
    <link href="https://boardgamegeek.com/market/product/40605" title="marketlisting"/>
  </listing>
  <listing>
    <listdate>Mon, 29 Sep 2008 15:25:32 +0000</listdate>
    <price currency="USD">34.95</price>
    <condition>new</condition>
    <notes>Brand New Sealed Board Game. Released from MayFair Games. Price is in USD. If you wish to pay in CAD...then we will convert at market rate. Shipping is $10.95 USD. We also carry the 5-6 Player Expansion that goes with this for $24.95 USD. We have sold thousands of board games across Canada. Please feel free to buy with confidence.</notes>
    <link href="https://boardgamegeek.com/market/product/116347" title="marketlisting"/>
  </listing>

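This is the part I don't know how to handle. The game has about 100 listings, and I want to create a data frame from them. Where do I start? The code below doesn't work; it returns NULL.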

listings_df <- do.call(rbind,lapply(
  getNodeSet(xmltop, '//marketplacelistings'),
  function(x) data.frame(
    XML:::xmlAttrsToDataFrame(xmlChildren(x)),
    row.names = NULL
  )))

The full file for this question is here: https://boardgamegeek.com/xmlapi/boardgame/13&type=boardgame,boardgameexpansion,boardgameaccesory,rpgitem,rpgissue,videogame&versions=1&stats=1&videos=1&marketplace=1&comments=1

Edit: here is my final solution. It may not be elegant, but it works.

marketplace_df_func <- function(xmltop) {

  # One column per field; the four vectors must come out the same length,
  # so this assumes every listing carries each of these elements.
  marketplace_df <- data.frame(
    listdate  = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//listdate"), xmlValue),
    currency  = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//price[@currency]"), xmlAttrs),
    price     = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//price"), xmlValue),
    condition = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//condition"), xmlValue))

  # Keep the first 25 characters of the date string (drops the trailing " +0000")
  marketplace_df$listdate <- substr(marketplace_df$listdate, 1, 25)

  return(marketplace_df)
}
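
The column-by-column approach above relies on every listing containing every field. As a rough sketch of a variation that tolerates a missing element, the rows can be built one listing at a time. The inline XML and the helper name child_text are made up for illustration and are not part of the original post.

```r
library(XML)

# Two made-up listings; the second one has no <condition> element.
txt <- '<marketplacelistings>
  <listing>
    <listdate>Thu, 19 Jan 2006 22:08:15 +0000</listdate>
    <price currency="EUR">90.00</price>
    <condition>likenew</condition>
  </listing>
  <listing>
    <listdate>Mon, 29 Sep 2008 15:25:32 +0000</listdate>
    <price currency="USD">34.95</price>
  </listing>
</marketplacelistings>'
doc <- xmlParse(txt, asText = TRUE)

# Hypothetical helper: text of the first child called `name`, or NA if absent.
child_text <- function(node, name) {
  kids <- xmlChildren(node)
  idx <- match(name, names(kids))
  if (is.na(idx)) NA_character_ else xmlValue(kids[[idx]])
}

# Build one single-row data frame per <listing>, then stack them.
rows <- lapply(getNodeSet(doc, "//listing"), function(node) {
  data.frame(
    listdate  = substr(child_text(node, "listdate"), 1, 25),
    price     = child_text(node, "price"),
    condition = child_text(node, "condition"),
    stringsAsFactors = FALSE
  )
})
listings <- do.call(rbind, rows)
print(listings)
```

With about 100 listings the per-row rbind is cheap; for much larger inputs a vectorised approach would likely be preferable.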

Because this XML now holds more of its data in elements than in attributes, you can simply call the exported xmlToDataFrame, with no lapply loop needed:

library(XML) 

url <- "..."
doc <- xmlParse(readLines(url))

listings_df <- xmlToDataFrame(doc, nodes = getNodeSet(doc, "//listing"))
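
Since the URL is elided above, here is a small runnable sketch of the same call against an inline string. The two listings are trimmed copies of the sample in the question (notes and link left out), and passing stringsAsFactors = FALSE is my own choice so the columns stay character on older R versions.

```r
library(XML)

# Trimmed sample: two listings with element data and one attribute on <price>.
txt <- '<marketplacelistings>
  <listing>
    <listdate>Thu, 19 Jan 2006 22:08:15 +0000</listdate>
    <price currency="EUR">90.00</price>
    <condition>likenew</condition>
  </listing>
  <listing>
    <listdate>Mon, 29 Sep 2008 15:25:32 +0000</listdate>
    <price currency="USD">34.95</price>
    <condition>new</condition>
  </listing>
</marketplacelistings>'
doc <- xmlParse(txt, asText = TRUE)

# One row per <listing>; columns come from the child element names.
listings_df <- xmlToDataFrame(doc,
                              nodes = getNodeSet(doc, "//listing"),
                              stringsAsFactors = FALSE)
str(listings_df)
```

Note that only element text ends up in the data frame; the currency attribute on price is not picked up by this call.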

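To bind the underlying attributes as well, use the internal (unexported) method xmlAttrsToDataFrame, accessed with the ::: operator: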

listings_df <- data.frame(
    xmlToDataFrame(doc, nodes = getNodeSet(doc, "//listing")),
    XML:::xmlAttrsToDataFrame(getNodeSet(doc, "//listing/price")),
    XML:::xmlAttrsToDataFrame(getNodeSet(doc, "//listing/link")),
    row.names = NULL
)
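
Unexported functions can change between package versions, so as a hedged alternative the same attributes can be pulled with exported functions only. This sketch uses xpathSApply with xmlGetAttr on a made-up two-listing sample; it assumes each listing has exactly one price and one link, otherwise the columns would not line up.

```r
library(XML)

# Made-up sample mirroring the structure in the question.
txt <- '<marketplacelistings>
  <listing>
    <price currency="EUR">90.00</price>
    <link href="https://boardgamegeek.com/market/product/40605" title="marketlisting"/>
  </listing>
  <listing>
    <price currency="USD">34.95</price>
    <link href="https://boardgamegeek.com/market/product/116347" title="marketlisting"/>
  </listing>
</marketplacelistings>'
doc <- xmlParse(txt, asText = TRUE)

# Extra arguments after the function are passed on to xmlGetAttr.
attrs_df <- data.frame(
  currency = xpathSApply(doc, "//listing/price", xmlGetAttr, "currency"),
  price    = as.numeric(xpathSApply(doc, "//listing/price", xmlValue)),
  href     = xpathSApply(doc, "//listing/link", xmlGetAttr, "href"),
  stringsAsFactors = FALSE
)
print(attrs_df)
```

The currency and href columns could then be added to the xmlToDataFrame result with cbind, provided both have one row per listing in the same order.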

