
Parsing webpage using Watir/Nokogiri in Ruby

I'm trying to parse the following website for the latitude and longitude contained in the popup box and can't seem to make it work. I'm using Watir and Nokogiri in Ruby. The code is below:

require 'watir'
require 'nokogiri'
require 'win32ole'
require 'open-uri'

# Get filename from user
puts "What is the name of the excel file?"
file_name1 = gets.chomp
file_name2 = file_name1 << '.xlsx'

# WIN32OLE
excel = WIN32OLE::new('excel.Application')
excel.visible = true
filepath = excel.Workbooks.Open('C:/users/desktop/ruby/' << file_name2)

url = 'http://webapps2.rrc.state.tx.us/EWA/drillingPermitsQueryAction.do'

# Excel Column Headers
excel.worksheets(2).Cells(1,12).value = "Latitude"
excel.worksheets(2).Cells(1,13).value = "Longitude"

# Watir
browser = Watir::Browser.new  # opens new IE browser
browser.speed = :zippy
browser.goto url  # goes to RRC page

row = 2

while excel.worksheets(2).Cells(row,5).value.nil? == false
    browser.text_field(:name, 'searchArgs.apiNoHndlr.inputValue').set excel.worksheets(2).Cells(row,5).value.to_s[0..7]
    browser.button(:value, 'Submit').click   # Clicks the submit button
    browser.select_list(:name, "propertyValue").select 'GIS Viewer'
    page_html = Nokogiri::HTML.parse(browser.html)
    latitude = page_html.css("#printIdentifyWellDiv > table:nth-child(5) > tbody > tr:nth-child(7) > td").text.strip
    longitude = page_html.css("#printIdentifyWellDiv > table:nth-child(5) > tbody > tr:nth-child(8) > td").text.strip
    excel.worksheets(2).Cells(row,12).value = latitude
    excel.worksheets(2).Cells(row,13).value = longitude
    browser.window(:title => "RRC Public GIS Viewer").use do
        browser.button(:id => "close").click
    end
    browser.button(:value, 'Return').click
    row += 1
end

puts "Complete"

The problem lies in lines 34 and 35 (the latitude and longitude variables). Nokogiri doesn't seem to be able to parse the values from the popup so that they can be written to the Excel file. I've tried both the XPath and the CSS path without success. Every time I run the program, the latitude and longitude columns in the Excel file end up blank.

Questions:

  1. How can I parse the data?
  2. As my program runs, the map screen linked above is brought up in a second tab in the browser. Is this a problem for Watir/Nokogiri? Do I need to somehow select that tab in the program for Nokogiri to be able to parse it?

Thank you for your time.

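Once you open a popup, you have to tell Watir to use it; otherwise browser.html still returns the HTML of the main window, so the selectors never see the popup's content.

Move this line:

browser.window(:title => "RRC Public GIS Viewer").use do

up so that the browser.html call (and the Nokogiri parsing that depends on it) runs inside the block.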
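
To make the change concrete, here is a minimal sketch of the idea as a small helper. The method name popup_html is hypothetical (it is not part of Watir), and the sketch assumes browser is the Watir browser from the question and that the popup has already opened by the time the helper is called.

```ruby
# Hypothetical helper: returns the HTML of the popup window whose title
# matches `title`. `browser` is assumed to be a Watir browser object.
def popup_html(browser, title)
  html = nil
  # Inside the `use` block, calls such as browser.html are directed at the
  # popup window instead of the main window.
  browser.window(:title => title).use do
    html = browser.html
  end
  # The result is captured in a local variable so the helper does not rely
  # on what `use` itself returns.
  html
end
```

In the loop from the question, the parsing line would then become something like page_html = Nokogiri::HTML.parse(popup_html(browser, "RRC Public GIS Viewer")), with the latitude and longitude selectors left unchanged. The existing block that clicks the close button can stay where it is, after the values have been read.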
