How to scrape employee counts from a LinkedIn company page using Selenium?

I'm trying to build a program that searches LinkedIn for an industry name, clicks the first profile in the list of results, and scrapes the exact employee count. I wrote code that I thought would work, but I can't understand why it isn't returning the exact employee count. The XPath seems correct. Any help at all would be really appreciated!

import time
import re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get('https://www.linkedin.com/login')

nameidElem = driver.find_element_by_id('username')
nameidElem.send_keys('username_here')

pwdidElem = driver.find_element_by_id('password')
pwdidElem.send_keys('password_here')

continueElem = driver.find_element_by_class_name("btn__primary--large")
result = continueElem.submit()
time.sleep(10)

industry = "books"
link = ("https://www.linkedin.com/search/results/companies/?keywords="
        + industry + "&origin=GLOBAL_SEARCH_HEADER")
driver.get(link)

firstcompany = driver.find_element_by_class_name("search-result__title")
firstcompany.click()

employees = driver.find_elements_by_xpath('//*[@id="ember1274"]')
number = re.findall(r'\d', employees.text)
print(number)

Use the XPath below to get the number of employees.

//*[.='Company size']/following-sibling::*[contains(.,'employees')]

After clicking the firstcompany link, make sure to wait for the element to be present before reading it.

Edit 1:

Use the XPath below to get the text of the 'See all XX employees on LinkedIn' link.

//a[@data-control-name='topcard_see_all_employees']/span

CSS:

a[data-control-name='topcard_see_all_employees'] span
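Once the link text has been read, the number still has to be pulled out of it. The question's pattern r'\d' matches one digit at a time, so re.findall would return a list of single digits rather than the count. Below is a small sketch of one way to parse it. The helper name parse_employee_count is hypothetical, and it assumes the text looks like 'See all 1,234 employees on LinkedIn' with commas as thousands separators.

```python
import re


def parse_employee_count(link_text):
    """Return the first number in the link text as an int, or None.

    Hypothetical helper: assumes commas are thousands separators,
    as in 'See all 1,234 employees on LinkedIn'.
    """
    # Match a digit followed by any run of digits or commas.
    match = re.search(r"\d[\d,]*", link_text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


if __name__ == "__main__":
    print(parse_employee_count("See all 1,234 employees on LinkedIn"))
```

The string passed in would be the .text of the span matched by either selector above.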
