Incorrectly parsing XML with Nokogiri in Ruby

I'm using Nokogiri to parse an XML response from last.fm. I get the information I want, but not in the format I'd like: the output appears to be one Nokogiri::XML document. What I want is one line per <track> containing the song's title, artist, and URL. Here is a sample of the XML:

<lfm status="ok">
  <toptracks metro="Beijing" page="1" perPage="50" totalPages="10" total="500">
    <track rank="1">
      <name>Rolling in the Deep</name>
      <duration>226</duration>
      <listeners>33</listeners>
      <mbid>092a88bc-af0b-4ddd-a3a1-17ad37abfccb</mbid>
      <url>
        http://www.last.fm/music/Adele/_/Rolling+in+the+Deep
      </url>
      <streamable fulltrack="0">1</streamable>
      <artist>
        <name>Adele</name>
        <mbid>1de93a63-3a9f-443a-ba8a-a43b5fe0121e</mbid>
        <url>http://www.last.fm/music/Adele</url>
      </artist>
      <image size="small">http://userserve-ak.last.fm/serve/34s/55125087.png</image>
      <image size="medium">http://userserve-ak.last.fm/serve/64s/55125087.png</image>
      <image size="large">http://userserve-ak.last.fm/serve/126/55125087.png</image>
      <image size="extralarge">
        http://userserve-ak.last.fm/serve/300x300/55125087.png
      </image>
    </track>
  </toptracks>
</lfm>

And here is the code I'm using:

doc = Nokogiri::HTML(open(url))

doc.xpath("//toptracks").each do |track|
  song_title = track.xpath("*/name").text
  song_lastfm_url = track.xpath("*/url").text
  song_artist = track.xpath("*/artist/name").text

  puts "#{song_title} - #{song_lastfm_url} - #{song_artist}"
end

As I mentioned, though, I'm getting all the song titles, followed by all the song URLs, followed by all the song artists, as one XML document.

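You're not iterating through the tracks like you think you are. //toptracks matches the single <toptracks> element, so the block runs only once, and each */... query collects the matching nodes from every track; calling text on that node set concatenates them all. Iterate over the <track> elements instead and query relative to each one. Try it like this: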

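doc.xpath('//toptracks/track').each do |track|
  # Query relative to the current <track>; strip the whitespace around <url>.
  song_title, song_lastfm_url, song_artist = track.xpath('./name', './url', './artist/name').map { |x| x.text.strip }

  puts "#{song_title} - #{song_lastfm_url} - #{song_artist}"
end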
