
Ruby Selenium WebDriver: get Google Knowledge Graph content

I am using the Ruby Selenium WebDriver and trying to get the content of the Google Knowledge Graph panel, which sits at the top right-hand side of the first Google search results page, inside <div class="xpdopen">:

@driver = Selenium::WebDriver.for :phantomjs
@driver.manage.timeouts.implicit_wait = 10
@driver.get "http://google.com"
element = @driver.find_element :name => "q"
element.send_keys "BMW"
element.submit
content = @driver.find_element(:class, 'xpdopen')

But Selenium cannot find this element and raises an error:

#<Selenium::WebDriver::Error::NoSuchElementError: {"errorMessage":"Unable to find element with class name 'xpdopen'"

When I tried $('.xpdopen') in the Chrome JS console, it found the element straight away.

I have also tried:

@driver.execute_script("return document.getElementsByClassName('xpdopen');")

but it cannot find the element either.
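
A collection returned from the browser can be awkward to interpret on the Ruby side, so one option is to return a plain count instead. The sketch below assumes only that the driver responds to execute_script(script, *args), as a Selenium driver does; the helper name count_by_class is hypothetical and not part of Selenium.

```ruby
# Hypothetical helper: asks the page how many elements carry a class name.
# `driver` is any object responding to #execute_script(script, *args),
# which a Selenium::WebDriver::Driver does.
def count_by_class(driver, class_name)
  # The class name is passed as an argument and read via arguments[0],
  # so it does not have to be interpolated into the JavaScript string.
  script = 'return document.getElementsByClassName(arguments[0]).length;'
  driver.execute_script(script, class_name)
end

# Usage against a live session (not run here):
#   count_by_class(@driver, 'xpdopen')
```

A count of zero would suggest that the element is missing from the DOM the driver sees, rather than the locator being wrong.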

I also tried @driver.page_source, and <div class="xpdopen"> is not in the page source, although I can see it in the Chrome console. Why?

How can I get this element with Selenium?

Here are the results I am getting from pry:
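
The page-source check described above can be made repeatable with a small helper. This is a minimal sketch, assuming only that the driver responds to page_source; the helper name class_in_source? is hypothetical, and the regular expression is a rough scan of class attributes, not a real HTML parser.

```ruby
# Hypothetical helper: true when the HTML the driver received contains an
# element whose class attribute lists `class_name` as a whole token.
# `driver` is any object responding to #page_source.
def class_in_source?(driver, class_name)
  # Collect the value of every class="..." attribute in the markup.
  class_values = driver.page_source.scan(/class\s*=\s*["']([^"']*)["']/)
  # Each match is a one-element array; split the value on whitespace and
  # compare whole class names so "xpdopen-foo" does not count as "xpdopen".
  class_values.any? { |(value)| value.split.include?(class_name) }
end

# Usage against a live session (not run here):
#   class_in_source?(@driver, 'xpdopen')
```

If this returns false for the driver while a desktop browser shows the element, the two are most likely looking at different markup, so no locator strategy will find it.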

[21] pry(main)> @driver = Selenium::WebDriver.for :phantomjs
=> #<Selenium::WebDriver::Driver:0x..f822d288ec7f0a708 browser=:phantomjs>
[22] pry(main)> @driver.manage.timeouts.implicit_wait = 10    
=> 10
[23] pry(main)> @driver.get "http://google.com"    
=> {}
[24] pry(main)> element = @driver.find_element :name => "q"    
=> #<Selenium::WebDriver::Element:0x..f389f4a8876f601e id=":wdc:1434526425103">
[25] pry(main)> element.send_keys "BMW"    
=> nil
[26] pry(main)> element.submit    
=> {}
[27] pry(main)> sleep 10    
=> 10
[28] pry(main)> content = @driver.find_element(:xpath, '//*[@id="rhs_block"]/ol/li/div[1]/div')    
Selenium::WebDriver::Error::NoSuchElementError: {"errorMessage":"Unable to find element with xpath '//*[@id=\"rhs_block\"]/ol/li/div[1]/div'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"gzip;q=1.0,deflate;q=0.6,identity;q=0.3","Connection":"close","Content-Length":"67","Content-Type":"application/json; charset=utf-8","Host":"127.0.0.1:8929","User-Agent":"Ruby"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"xpath\",\"value\":\"//*[@id=\\\"rhs_block\\\"]/ol/li/div[1]/div\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/2f3cf350-14c3-11e5-9f8e-4173e8049986/element"}} (org.openqa.selenium.NoSuchElementException)

and

[29] pry(main)> content = @driver.find_element(:css, "#rhs_block > ol > li > div.kp-blk._Jw._Rqb._RJe > .xpdopen")
Selenium::WebDriver::Error::NoSuchElementError: {"errorMessage":"Unable to find element with css selector '#rhs_block > ol > li > div.kp-blk._Jw._Rqb._RJe > .xpdopen'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"gzip;q=1.0,deflate;q=0.6,identity;q=0.3","Connection":"close","Content-Length":"113","Content-Type":"application/json; charset=utf-8","Host":"127.0.0.1:8929","User-Agent":"Ruby"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"css selector\",\"value\":\"#rhs_block \\u003e ol \\u003e li \\u003e div.kp-blk._Jw._Rqb._RJe \\u003e .xpdopen\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/2f3cf350-14c3-11e5-9f8e-4173e8049986/element"}} (org.openqa.selenium.NoSuchElementException)

and just to demonstrate that it is finding other elements on the same page without problems:

[30] pry(main)> results = @driver.find_elements(:xpath, "//p/a") 
=> [#<Selenium::WebDriver::Element:0x6f6a74631e2b7010 id=":wdc:1434527087873">,
 #<Selenium::WebDriver::Element:0x7b6d276448081688 id=":wdc:1434527087874">,
 #<Selenium::WebDriver::Element:0x..f9504a4171b03970a id=":wdc:1434527087875">,
 #<Selenium::WebDriver::Element:0x..fa6e0158aa8d24e2a id=":wdc:1434527087876">,
 #<Selenium::WebDriver::Element:0x327bf842e4399368 id=":wdc:1434527087877">,
 #<Selenium::WebDriver::Element:0x..fae292d7ca211ab32 id=":wdc:1434527087878">,
 #<Selenium::WebDriver::Element:0x129a58eb5ed6ee9c id=":wdc:1434527087879">,
 #<Selenium::WebDriver::Element:0x46ef3b45800e63e0 id=":wdc:1434527087880">,
 #<Selenium::WebDriver::Element:0x26bfb47f8ad498ea id=":wdc:1434527087881">,
 #<Selenium::WebDriver::Element:0x..f03756c2924a2974 id=":wdc:1434527087882">,
 #<Selenium::WebDriver::Element:0xfba93aab4b32af8 id=":wdc:1434527087883">]

I took a screenshot and found out that PhantomJS does not display the knowledge graph (the content is not there at all).

Screenshot from PhantomJS: [image: PhantomJS page content]

Screenshot from Firefox: [image: Firefox page content]

Why does PhantomJS not have the knowledge graph content?

Apparently CSS won't know where to find the xpdopen class on its own; you have to give the entire path to the element:

XPath:

content = @driver.find_element(:xpath, '//*[@id="rhs_block"]/ol/li/div[1]/div')

CSS:

content = @driver.find_element(:css, "#rhs_block > ol > li > div.kp-blk._Jw._Rqb._RJe > .xpdopen")
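
If the full-path locators still fail under PhantomJS, as they do in the pry session in the question, one possible explanation (not confirmed anywhere in this thread) is that Google serves different markup depending on the browser's user agent, so the panel never reaches PhantomJS. The sketch below shows one way to test that idea by overriding the user agent. The capability key phantomjs.page.settings.userAgent comes from GhostDriver; the user-agent string and the helper name phantomjs_caps are assumptions for illustration, and the commented usage applies to selenium-webdriver 2.x, which still supported PhantomJS.

```ruby
# Assumed desktop user-agent string, for illustration only; substitute the
# one your own desktop browser reports.
DESKTOP_UA = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' \
             '(KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36'

# Hypothetical helper: builds the capability hash that asks GhostDriver to
# override the user agent PhantomJS sends with every request.
def phantomjs_caps(user_agent = DESKTOP_UA)
  { 'phantomjs.page.settings.userAgent' => user_agent }
end

# Usage with selenium-webdriver 2.x and PhantomJS installed (not run here):
#   require 'selenium-webdriver'
#   caps = Selenium::WebDriver::Remote::Capabilities.phantomjs(phantomjs_caps)
#   @driver = Selenium::WebDriver.for :phantomjs, :desired_capabilities => caps
#   @driver.execute_script('return navigator.userAgent')
```

Whether the knowledge graph then appears is something to verify with a fresh screenshot; if it does not, driving a real browser such as Firefox, where the question's screenshot shows the panel, is the more direct route.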

