
Ruby: Reading file contents and pushing selected values into an array within an array

I want to write a simple Ruby program that reads the contents of a single HTML page and collects two pieces of information per entry into an array.

For instance, this is the webpage: http://www.trulia.com/real_estate/Cambridge-Massachusetts/

I want my output to be:

output = [ [Mid-Cambridge, $642,126],
[North Cambridge, $602,100],
[East Cambridge, $611,436],
[Neighborhood Nine, $1,068,284],
[West Cambridge, $1,577,444] ]

I was thinking of doing something like:

File.read(filename).include?(each_neighborhood)

From there, I would push each neighborhood and the price nearest to it in the HTML file into an array together, then rinse and repeat. But I feel this might not be the most efficient method, and I am not sure how to implement it either.

I also heard that the gem 'search_in_file' could be useful. But it may not be necessary.

You might want to take a look at Nokogiri; it is a great gem for working with web pages and extracting information from them.

Here's a little script that does it:

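#!/usr/bin/env ruby
require 'nokogiri'
require 'open-uri'

url = "http://www.trulia.com/real_estate/Cambridge-Massachusetts/"

# Fetch the page. URI.open comes from open-uri; a bare open(url)
# no longer opens URLs on Ruby 3.0 and later.
web_page = URI.open(url).read
doc = Nokogiri::HTML.parse(web_page)

# Neighborhood names and listing prices sit in separate table cells.
neighborhoods = doc.css('#most_popular td.txtL').map(&:text)
listing_prices = doc.css('#most_popular td.txtC').map(&:text)

# Pair each neighborhood with its price by position.
output = neighborhoods.zip(listing_prices)
puts output.inspect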

The output looks something like this:

[["Mid-Cambridge", "$642,126"],
 ["North Cambridge", "$602,100"],
 ["East Cambridge", "$611,436"],
 ["Neighborhood Nine", "$1,068,284"],
 ["West Cambridge", "$1,577,444"]]
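Each pair in that output comes from `zip`, which matches the two lists purely by position, so the script assumes every neighborhood cell has a corresponding price cell. Here is a small sketch with made-up, shortened data (not fetched from the page) showing what `zip` does when the lists differ in length, and one way to drop incomplete pairs:

```ruby
# Made-up sample data; the second list is one item short on purpose.
names  = ["Mid-Cambridge", "North Cambridge", "East Cambridge"]
prices = ["$642,126", "$602,100"]

# zip pads the missing positions with nil instead of raising an error.
pairs = names.zip(prices)

# Keep only the pairs that actually have a price.
complete = pairs.reject { |_name, price| price.nil? }

puts pairs.inspect
puts complete.inspect
```

If the page layout ever changes so that the two selectors return different numbers of cells, a check like this would make the mismatch visible rather than silently producing `nil` prices.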

Pretty much what you're looking for, right?
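The prices come back as strings such as "$642,126". If you later want to sort or compare them, one option is to strip everything that is not a digit and convert to integers. This is only a sketch: the data is hard-coded from the sample output above so it runs offline, and the helper name `price_to_i` is my own invention, not part of Nokogiri.

```ruby
# Sample data copied from the output above (hard-coded, not fetched).
output = [
  ["Mid-Cambridge", "$642,126"],
  ["North Cambridge", "$602,100"],
  ["East Cambridge", "$611,436"],
  ["Neighborhood Nine", "$1,068,284"],
  ["West Cambridge", "$1,577,444"]
]

# Hypothetical helper: delete every character that is not 0-9, then convert.
def price_to_i(price)
  price.delete("^0-9").to_i
end

# Build [name, integer_price] pairs.
numeric = output.map { |name, price| [name, price_to_i(price)] }

# Sort from most to least expensive.
by_price = numeric.sort_by { |_name, price| -price }

# A hash makes lookups by neighborhood name straightforward.
prices = numeric.to_h

puts by_price.first.inspect     # ["West Cambridge", 1577444]
puts prices["East Cambridge"]   # 611436
```

This assumes whole-dollar prices; a value with cents or a missing price would need extra handling.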

