
How to iterate over a table and print the results from the first 10 rows with python and selenium webdriver?

Using Selenium WebDriver with Python, I can locate the search box and run a search that returns results. However, I want to print the results from the first 10 rows returned (excluding the title row).

The site I am using is http://www.hoovers.com/company-information/company-search.html?term=simon, with "simon" as an example search term.

I have been searching for a while and have tried many things, including XPaths, and most of them error out. This is the closest I've come so far:

for row in mydriver.find_elements_by_class_name('cmp-company-directory'):
        cell = row.find_elements_by_tag_name("td")[0]
        print(cell.text)

However, it only returns the first row and does not iterate through the table. Any tips? TIA!

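Try the XPath below. It walks through the table cell by cell and prints the first 10 rows.

# Note: Selenium 4.3+ removed find_elements_by_xpath; there, use driver.find_elements(By.XPATH, "...") instead.
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td")
counter = 1
for element in elements:
    print(element.text)
    counter += 1
    # Stop at the cell cap that covers the first 10 rows
    if counter == 50:
        break

Output: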

Simon Property Group, Inc.
Indianapolis, IN, United States
$5538.64M
See Details

SIMON & SCHUSTER (UK) LIMITED
London, London, England
$60.39M
See Details

SIMON JERSEY GROUP LIMITED
Accrington, Lancashire, England

See Details

Simon Worldwide, Inc.
Irvine, CA, United States
$0.0M
See Details

Simon Property Group, L.P.
Indianapolis, IN, United States
$5538.64M
See Details

Günter Simon e.K. Inh. Carmen Simon
Ravensburg, Baden-Württemberg, Germany

See Details

Simon e Simon Servicos Odontologicos Ltda
Vere, Parana, Brazil

See Details

Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Aparecida De Goiania, Goias, Brazil

See Details

Simon Levelt B.V.
Haarlem, Noord-Holland, The Netherlands

See Details

SIMON SAU
Barcelona, Barcelona, Spain
$115.95M
See Details
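The loop above limits the output by counting cells, which only lines up with "10 rows" if every row has the same number of cells. A minimal sketch of counting rows instead is shown here. The helper name first_rows is hypothetical, and it assumes the rows are passed in as objects with Selenium's find_elements(by, value) method, for example the result of driver.find_elements("xpath", "//div[...]//tr") using the same container XPath as above.

```python
from itertools import islice


def first_rows(rows, limit=10):
    """Return the cell texts of the first `limit` data rows.

    `rows` is any iterable of objects exposing find_elements(by, value),
    such as the <tr> WebElements of the results table (an assumption here).
    Rows without <td> cells, e.g. a header row built from <th>, are skipped.
    """
    # "tag name" is the string value behind selenium's By.TAG_NAME
    cell_lists = (row.find_elements("tag name", "td") for row in rows)
    data_rows = (cells for cells in cell_lists if cells)
    return [[cell.text for cell in cells] for cells in islice(data_rows, limit)]


# Usage with a live driver (not run here):
#   rows = driver.find_elements("xpath", "//div[...]//tr")
#   for texts in first_rows(rows):
#       print(texts)
```

Each inner list holds one row, so the limit no longer depends on how many cells a row contains.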

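If you want to print only the company names from the first 10 rows, try this.

# Note: Selenium 4.3+ removed find_elements_by_xpath; there, use driver.find_elements(By.XPATH, "...") instead.
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td[@class='company_name']")
counter = 0
for element in elements:
    print(element.text)
    counter += 1
    # Stop after 10 company names
    if counter == 10:
        break

Output: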

Simon Property Group, Inc.
SIMON & SCHUSTER (UK) LIMITED
SIMON JERSEY GROUP LIMITED
Simon Worldwide, Inc.
Simon Property Group, L.P.
Günter Simon e.K. Inh. Carmen Simon
Simon e Simon Servicos Odontologicos Ltda
Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Simon Levelt B.V.

Let me know if this works for you.
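The manual counter in the loops above can also be expressed with itertools.islice from the standard library. This is only a sketch: first_texts is a hypothetical helper name, and it assumes the input is the list of elements returned by the XPath lookup, each with a .text attribute.

```python
from itertools import islice


def first_texts(elements, limit=10):
    """Return the .text of at most `limit` elements, in their original order."""
    # islice stops after `limit` items and copes with shorter inputs
    return [element.text for element in islice(elements, limit)]


# Usage with a live driver (not run here):
#   for name in first_texts(elements):
#       print(name)
```

Because the element lookup returns a list, plain slicing such as elements[:10] would do the same job; islice simply also accepts generators.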

To print the company names without the title row, use WebDriverWait with visibility_of_all_elements_located so the links are visible before you read them. Either of the following locators works:

  • CSS_SELECTOR:

     print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))])

  • XPATH:

     print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))])

To print only the first 10 company names without the title row, use the same WebDriverWait with visibility_of_all_elements_located, then slice the returned list with [:10] to keep 10 elements. Either of the following locators works:

  • CSS_SELECTOR:

     print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))[:10]])

  • XPATH:

     print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))[:10]])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
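WebDriverWait.until accepts any callable that takes the driver and returns a truthy value once the wait is satisfied, so the wait and the 10-element limit can be combined in a custom condition. The sketch below is an illustration and not part of the answer above: at_least_n_elements is a hypothetical name, and it waits for presence of at least n matches instead of checking visibility.

```python
def at_least_n_elements(locator, n):
    """Build a wait condition that passes once `locator` matches >= n elements."""

    def _condition(driver):
        elements = driver.find_elements(*locator)
        # A falsy return value makes WebDriverWait poll again until it times out
        return elements[:n] if len(elements) >= n else False

    return _condition


# Usage with a live driver and the imports above (not run here):
#   locator = (By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")
#   links = WebDriverWait(driver, 10).until(at_least_n_elements(locator, 10))
#   print([link.text for link in links])
```

If the search returns fewer than n results, this wait ends in a TimeoutException, so the plain [:10] slice is the safer choice when the result count is unknown.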


 