LinkedIn scraping, list comprehension
I am trying to collect profile URLs using a list comprehension, but the list is overwritten on every page, so I can't keep the URLs from different pages; it only stores the ones from the last page.
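# imports needed by the snippet below
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

options = Options()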
options.add_argument("--start-maximized")
# options.headless = True
url = "https://www.linkedin.com/login?fromSignIn=true&trk=guest_homepage-basic_nav-header-signin"
driver = webdriver.Chrome(r"path", options=options)
driver.get(url)
driver.find_element_by_id('username').send_keys('name')
driver.find_element_by_id('password').send_keys('pass', Keys.ENTER)
driver.implicitly_wait(10)
driver.find_element_by_class_name('search-global-typeahead__input').send_keys('Marketing manager', Keys.ENTER)
driver.implicitly_wait(10)
driver.find_element_by_xpath('//button[text()="People"]').click()
x = 0
profile = []
linklist = []
condition = True
while condition:
    sleep(2)
    driver.execute_script("window.scrollTo(0, 1400);")
    driver.implicitly_wait(10)
    linkedin_members = driver.find_elements_by_xpath('//span[@class="entity-result__title"]')
    links = [linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('href') for linkedin_member in linkedin_members if "/in/" in linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('href')]
    x = x + 1
    driver.implicitly_wait(10)
    if x == 3:
        condition = False
    driver.find_element_by_xpath("""//button[@class='artdeco-pagination__button artdeco-pagination__button--next artdeco-button artdeco-button--muted artdeco-button--icon-right artdeco-button--1 artdeco-button--tertiary ember-view' and contains(.,'Next')]""").click()
for l in links:
    driver.get(l)
The list comprehension rebuilds links on every iteration, so you should add each page's results to a list that is defined outside the while loop, for example:
links = [linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('href') for linkedin_member in linkedin_members if "/in/" in linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('href')]
# add this page's links to linklist; extend keeps linklist flat,
# whereas append(links) would store one nested list per page
linklist.extend(links)
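To see the accumulation pattern in isolation, here is a minimal sketch that runs without Selenium. The `pages` data and the `collect_profile_links` helper are hypothetical stand-ins for the hrefs scraped on each results page; only the "/in/" filter and the `links` / `linklist` names come from the code above.

```python
# Hypothetical stand-in for the hrefs Selenium would return on each results page.
pages = [
    ["https://www.linkedin.com/in/alice", "https://www.linkedin.com/company/acme"],
    ["https://www.linkedin.com/in/bob"],
    ["https://www.linkedin.com/in/carol"],
]


def collect_profile_links(pages):
    """Gather profile URLs from every page into one flat list."""
    linklist = []  # created once, before the loop, so it survives every iteration
    for hrefs in pages:  # one iteration per results page
        # links is rebuilt from scratch on each page, as in the question
        links = [href for href in hrefs if "/in/" in href]
        # extend adds the individual URLs; append(links) would nest a list per page
        linklist.extend(links)
    return linklist


all_links = collect_profile_links(pages)
print(all_links)
```

With the results gathered this way, the final loop in the question would iterate over linklist rather than links, so that profiles from every page are visited.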