How to scrape employee counts from a LinkedIn company page using Selenium?
I'm trying to build a program that searches LinkedIn for an industry name, clicks the first profile in the list of results, and scrapes the exact employee count. I wrote code that I thought would work, but I can't understand why it isn't returning the exact employee count. The XPath seems correct. Any help at all would be really appreciated!
import time
import re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
driver.get('https://www.linkedin.com/login')
nameidElem = driver.find_element_by_id('username')
nameidElem.send_keys('username_here')
pwdidElem = driver.find_element_by_id('password')
pwdidElem.send_keys('password_here')
continueElem = driver.find_element_by_class_name("btn__primary--large")
result = continueElem.submit()
time.sleep(10)
industry = "books"
link = "https://www.linkedin.com/search/results/companies/?keywords=" + industry + "&origin=GLOBAL_SEARCH_HEADER"
driver.get(link)
firstcompany = driver.find_element_by_class_name("search-result__title")
firstcompany.click()
employees = driver.find_elements_by_xpath('//*[@id="ember1274"]')
number = re.findall(r'\d', employees.text)
print(number)
Use the XPath below to get the number of employees.
//*[.='Company size']/following-sibling::*[contains(.,'employees')]
After clicking the firstcompany link, make sure to wait for the element to be present before reading it.
Edit 1:
Use the XPath below to locate the 'See all XX employees on LinkedIn' link.
//a[@data-control-name='topcard_see_all_employees']/span
CSS:
a[data-control-name='topcard_see_all_employees'] span
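Once the link text is available, the count still has to be pulled out of it. Two details in the question's code get in the way: `find_elements_by_xpath` returns a list, which has no `.text` attribute, and `re.findall(r'\d', ...)` returns individual digit characters rather than one number. Below is a rough sketch of the parsing step, assuming the text looks like 'See all XX employees on LinkedIn' and that large counts may contain thousands separators; the helper name `parse_employee_count` is hypothetical.

```python
import re


def parse_employee_count(text):
    """Return the first integer found in text, or None if there is none.

    Commas are treated as thousands separators (an assumption about
    how the count is displayed).
    """
    # Match a run of digits that may contain commas, e.g. "1,234".
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))
```

With a single element located via `find_element` (singular), the call would be `parse_employee_count(element.text)`.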