How do I scrape a website and output the data to an XML file with Nokogiri?
I've been trying to scrape data using Nokogiri and HTTParty. I can scrape data off a website successfully and print it to the console, but I can't work out how to output the data to an XML file in the repo.
Right now the code looks like this:
    require 'httparty'
    require 'nokogiri'

    class Scraper
      attr_accessor :parse_page

      def initialize
        doc = HTTParty.get("https://store.nike.com/gb/en_gb/pw/mens-nikeid-lifestyle-shoes/1k9Z7puZoneZoi3?ref=https%253A%252F%252Fwww.google.com%252F")
        # Parse the response body into a Nokogiri document
        @parse_page ||= Nokogiri::HTML(doc.body)
      end

      # Text of every product name on the page
      def get_names
        item_container.css(".product-display-name").css("p").children.map { |name| name.text }.compact
      end

      # Text of every local price on the page
      def get_prices
        item_container.css(".product-price").css("span.local").children.map { |price| price.text }.compact
      end

      private

      def item_container
        parse_page.css(".grid-item-info")
      end
    end

    scraper = Scraper.new
    names = scraper.get_names
    prices = scraper.get_prices

    (0...prices.size).each do |index|
      puts " - - - Index #{index + 1} - - -"
      puts "Name: #{names[index]} | Price: #{prices[index]}"
    end
I've tried changing the .each method to include a File.write(), but all it ever does is write the last line of the output into the XML file. I would appreciate any insight into how to parse the data correctly; I am new to scraping.
Is the File.write method inside the each loop? I guess what's happening here is that you are overwriting the file on every iteration, and that's why you are seeing only the last line.
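A minimal sketch of that overwriting behaviour, assuming made-up sample lines and a temporary file in place of the real output path (neither comes from the question):

```ruby
require "tmpdir"

# Hypothetical sample lines standing in for the scraped output.
lines = ["Name: Shoe A | Price: 100", "Name: Shoe B | Price: 120"]

# Assumed throwaway location for the demo file.
path = File.join(Dir.mktmpdir, "overwrite_demo.txt")

# File.write truncates the file on every call, so each iteration
# replaces whatever the previous iteration wrote.
lines.each { |line| File.write(path, line) }

content = File.read(path)
puts content
```

Only the final line survives, which matches the symptom described in the question.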
Try putting the each loop inside the block of the File.open method, like:
    File.open(yourfile, 'w') do |file|
      (0...prices.size).each do |index|
        file.write("your text")
      end
    end
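To make the File.open pattern concrete, here is a rough sketch that writes hand-built XML. The sample names and prices, the element names (shoes, shoe, name, price) and the output path are all assumptions for illustration; in the real script the arrays would come from scraper.get_names and scraper.get_prices.

```ruby
require "tmpdir"

# Hypothetical sample data standing in for the scraped names and prices.
names  = ["Shoe A", "Shoe B & C"]
prices = ["100", "120"]

# Assumed output location; a path inside the repo would work the same way.
path = File.join(Dir.mktmpdir, "shoes.xml")

# The file is opened once, and every iteration writes through the same handle,
# so earlier lines are kept instead of being overwritten.
File.open(path, "w") do |file|
  file.puts '<?xml version="1.0" encoding="UTF-8"?>'
  file.puts "<shoes>"
  names.zip(prices).each do |name, price|
    # encode(xml: :text) escapes &, < and > so the markup stays well-formed.
    file.puts "  <shoe>"
    file.puts "    <name>#{name.encode(xml: :text)}</name>"
    file.puts "    <price>#{price.encode(xml: :text)}</price>"
    file.puts "  </shoe>"
  end
  file.puts "</shoes>"
end

xml = File.read(path)
puts xml
```

Building XML by string interpolation is fine for a quick script, but escaping has to be handled by hand, which is one reason a builder may be preferable.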
I also recommend reading about Nokogiri::XML::Builder and then saving its output to the file.