
Python Selenium: how to scrape a list of values from a website

I have a list of job roles on this website that I want to scrape. The code I am using is below:

driver.get('https://jobs.ubs.com/TGnewUI/Search/home/HomeWithPreLoad?partnerid=25008&siteid=5012&PageType=searchResults&SearchType=linkquery&LinkID=6017#keyWordSearch=&locationSearch=')
job_roles = driver.find_elements(By.XPATH, '/html/body/div[2]/div[2]/div[1]/div[6]/div[3]/div/div/div[5]/div[2]/div/div[1]/ul/li[1]/div[2]/div[1]/span/a')

for job_role in job_roles:
    text = job_role.text
    print(text)

With this code, I am able to retrieve the first role, which is: Business Analyst - IB Credit Risk Change

I am unable to retrieve the other roles. Can someone kindly assist?

Thanks

In this case, all the job names have the two CSS classes jobProperty and jobtitle.

So, since you want all the jobs, I recommend selecting by CSS selector.

The following example should work:

from selenium.webdriver.common.by import By

driver.get('https://jobs.ubs.com/TGnewUI/Search/home/HomeWithPreLoad?partnerid=25008&siteid=5012&PageType=searchResults&SearchType=linkquery&LinkID=6017#keyWordSearch=&locationSearch=')

# find_elements_by_css_selector was removed in Selenium 4.3; use find_elements with By.CSS_SELECTOR
job_roles = driver.find_elements(By.CSS_SELECTOR, '.jobProperty.jobtitle')
for job_role in job_roles:
    text = job_role.text
    print(text)
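
If the list comes back empty, one possible cause is that the results are rendered by JavaScript after the initial page load. That is an assumption about this site, not something confirmed here. A minimal sketch of waiting for the elements before reading them is below; the function names extract_titles and scrape_job_titles and the 10-second timeout are made up for illustration.

```python
def extract_titles(elements):
    """Return the stripped, non-empty .text of each element."""
    titles = []
    for el in elements:
        text = el.text.strip()
        if text:
            titles.append(text)
    return titles


def scrape_job_titles(driver, url, timeout=10):
    """Open url, wait for the job title elements, and return their text."""
    # Imported here so extract_titles can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    # Assumption: the job list appears some time after the page itself loads.
    # until() returns the list of matching elements once at least one exists,
    # or raises TimeoutException after `timeout` seconds.
    elements = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, ".jobProperty.jobtitle")
        )
    )
    return extract_titles(elements)
```

Calling scrape_job_titles(driver, url) with the driver and URL from the question would then return the titles as a list of strings instead of printing them one by one.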

If you want to use XPath, you were very close. Your XPath specifically selects only the first li element (li[1]). Changing it to just li makes the expression match every li in the list, so all the job links are found:

driver.get('https://jobs.ubs.com/TGnewUI/Search/home/HomeWithPreLoad?partnerid=25008&siteid=5012&PageType=searchResults&SearchType=linkquery&LinkID=6017#keyWordSearch=&locationSearch=')
job_roles = driver.find_elements(By.XPATH, '/html/body/div[2]/div[2]/div[1]/div[6]/div[3]/div/div/div[5]/div[2]/div/div[1]/ul/li/div[2]/div[1]/span/a')

for job_role in job_roles:
    text = job_role.text
    print(text)
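
The effect of the positional predicate can be checked without a browser. The sketch below uses the standard library's xml.etree.ElementTree, which supports a small XPath subset, on a made-up fragment; the markup and job names are simplified stand-ins, not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a simplified stand-in for the job list, not the real page.
SAMPLE = """
<ul>
  <li><span><a>Business Analyst</a></span></li>
  <li><span><a>Risk Officer</a></span></li>
  <li><span><a>Software Engineer</a></span></li>
</ul>
"""

root = ET.fromstring(SAMPLE)  # root is the <ul> element

# li[1] keeps only the first <li>, so a single link is matched.
first_only = [a.text for a in root.findall("./li[1]/span/a")]

# Plain li matches every <li>, so all three links are matched.
all_roles = [a.text for a in root.findall("./li/span/a")]

print(first_only)  # ['Business Analyst']
print(all_roles)   # ['Business Analyst', 'Risk Officer', 'Software Engineer']
```

The same reasoning applies to the long absolute path used with Selenium: every index left in the path narrows the match to one element at that level.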


 