
Selenium Python can't find elements in an inner part of a web page with a separate scroll bar

I'm trying to extract the names of 1268 companies from the website of an industrial fair, which has published its exhibitor list on this page.

Unfortunately, Selenium does not seem to find elements inside the specific inner part of the page that contains the company names and has its own scrollbar.

[Screenshot: page rendering]

Here's my code:

g = webdriver.Chrome()
g.get("https://www.ecomondo.com/elenco-espositori/espositori-ecomondo")
g.maximize_window()
time.sleep(2)
cookie = WebDriverWait(g,15).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="c-p-bn"]'))
)
cookie.click()
time.sleep(5)
element = g.find_element_by_class_name('sc-1aq2rfp-0 sc-li856a-3 eqfJYB euBeDv')
ActionChains(g).move_to_element_with_offset(element, 0, 0).perform()

company_name = g.find_elements_by_xpath('//*[@id="__next"]/div[3]/div/div/div/div/div/a/div/div/span[1]')
print(company_name)

I also tried finding the element by XPath, but the result is the same:

Message: no such element: Unable to locate element: {"method":"css selector","selector":".sc-1aq2rfp-0 sc-li856a-3 eqfJYB euBeDv"}

After finding the element, I need to scroll the sidebar down to make all 1268 company names visible and then extract them, but that is a separate problem.

Any hints?

The desired elements are within an <iframe>, so you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for the desired elements to be visible.

  • You can use the following XPath-based locator strategy:

     driver.get('https://www.ecomondo.com/elenco-espositori/espositori-ecomondo')
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='ACCETTA TUTTI I COOKIE E CONTINUA']"))).click()
     WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[starts-with(@src, 'https://ecomondo.app.swapcard.com/widget/event/ecomondo-and-key-energy-2020/exhibitors')]")))
     print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='infinite-scroll-component ']//a//following::span[1]")))])

  • Console Output:

     ['2LNG', 'D5/080', '3M ITALIA SRL', 'D3/011', '3META SRL', 'C6/002', '3U VISION SRL', 'A3/178', '3V GREEN EAGLE SPA', 'C1/105', '4 ESSE SRL', 'B4/007', '4SERVICE EUROPE S.R.L.', 'A3/027', '9-TECH SRL', 'SUD/064', 'ABC. BILANCE SRL', 'A2/068', 'A.C.R. DI REGGIANI ALBERTINO SPA', 'C1/152', 'AEC. SRL', 'A7/014', 'AIDPI ASSOCIAZIONE IMPRESE DISINFESTAZIONEPROFESSIONALI ITALIANE', 'A5-C5/002 F', 'AIR.E.C. - ASSOCIAZIONE ITALIANA DEL RECUPERO ENERGETICO DA COMBUSTIBILI SOLIDI SECONDARI', 'B1/013', 'AMS SPA - ATTREZZATURE MECCANICHE SPECIALI', 'C7/003', 'AT RICAMBI SRL', 'B1/164', 'AU ESSE SRL', 'A5/036', 'A2A ENERGIA SPA', 'B5/040', 'A2A ENERGY SOLUTIONS SRL', 'B5/040', 'A2A SPA', 'B5/040', 'AB ENERGY SPA', 'B5-D5/005', "ABICERT L' ENTE DI CERTIFICAZIONE", 'C4/009', "ACCIAI DI QUALITA' SPA", 'A1/058', 'ACEA SPA', 'D1/160', 'ACQUEDOTTO PUGLIESE SPA', 'D2/002', 'ACR+', 'B5/008', 'ACTA SRL', 'C1/002', 'ADAMBÍ - ADGENERA SRL', 'A5/143', 'ADAMOLI SRLS', 'C5/140', 'ADDAX MOTORS NV', 'A6/022', 'ADEME AGENCE DE LA TRANSITION ECOLOGIQUE', 'ADICOMP S.R.L.', 'D5/133', 'ADRIAECO', 'ADRIATICA ACQUE SRL', 'B4/029', 'AEBI SCHMIDT ITALIA SRL', 'A7/002', 'AEBIG - ASOCIACION ESPANOLA DE BIOGAS', 'ÆVOLUTION MATEUSZ WIELOPOLSKI CONSULTING', 'B5/045', 'AFFILOR SRL', 'A1/001', 'AGECO DUE SPA', 'C1/186', 'AGENCE NATIONALE DES DECHETS', 'B5/008', 'AGENDA SRL - EDIZIONI ACQUAGENDA, GASAGENDA, WATERGAS.IT', 'D3/192', "AGENZIA REGIONALE PER LA TUTELA DELL'AMBIENTE", 'D4/032', 'AGER PUGLIA', 'D2/002', 'AGRICOLUS SRL', 'SUD/063', 'AGRICOM SRL', 'D5/157', 'AGRIGARDEN AMBIENTE SRL A SOCIO UNICO', 'C1/056', 'AGRIPLAST SRL', 'C1/139', 'AGRITECH SRL', 'D4/011', 'AGROTEL GMBH', 'D5/029', 'AIMAG SPA', 'B3/109', 'AIR CLEAN SRL', 'A3/101']

  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
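
The console output mixes company names with what appear to be stand codes (for example 'D5/080' or 'A5-C5/002 F'), and some companies have no code at all, so taking every second entry would not be reliable. If only the names are wanted, as in the question, one option is to filter the list after scraping. This is a minimal sketch and not part of the original answer: the STAND_CODE pattern and the company_names helper are hypothetical names, and the pattern is inferred only from the sample output shown here, so it may need adjusting for the full list.

```python
import re

# Assumption: a stand code looks like one or two hall ids joined by "-",
# then "/", three digits, and an optional trailing letter,
# e.g. "D5/080", "SUD/064", "B5-D5/005", "A5-C5/002 F".
# This shape is guessed from the sample output only.
STAND_CODE = re.compile(r"^[A-Z0-9]+(?:-[A-Z0-9]+)?/\d{3}(?: [A-Z])?$")


def company_names(texts):
    """Return the entries of `texts` that do not look like stand codes."""
    return [text for text in texts if not STAND_CODE.match(text)]


if __name__ == "__main__":
    # A few entries copied from the console output.
    sample = [
        '2LNG', 'D5/080',
        '9-TECH SRL', 'SUD/064',
        'AB ENERGY SPA', 'B5-D5/005',
        'ADEME AGENCE DE LA TRANSITION ECOLOGIQUE',
        'ADICOMP S.R.L.', 'D5/133',
        'ACR+', 'B5/008',
    ]
    print(company_names(sample))
```

In the answer's code, the list comprehension passed to print() could be assigned to a variable and handed to company_names() instead of being printed directly.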

