Python - web crawling / different results from the same code? / requests, bs4 / M1

Question

I am learning Python for web crawling, but I'm totally stuck.

Each time I run this code, the results change.

Very rarely it works, but it almost always returns an empty list.

Why does this happen? Please let me know.

from indeed import extract_indeed_pages, extract_indeed_jobs


last_indeed_page = extract_indeed_pages()

print(last_indeed_page)

indeed_jobs = extract_indeed_jobs(last_indeed_page)

print(indeed_jobs)

import requests
from bs4 import BeautifulSoup

LIMIT = 50
URL = f"https://kr.indeed.com/jobs?q=React&l=%EC%84%9C%EC%9A%B8&radius=100&jt=fulltime&limit={LIMIT}"


def extract_indeed_pages():
    result = requests.get(URL)
    soup = BeautifulSoup(result.text, "html.parser")
    pagination = soup.find("div", {"class": "pagination"})

    links = pagination.find_all('a')
    pages = []
    for link in links[:-1]:
        pages.append(int(link.string))

    max_page = pages[-1]
    return max_page


def extract_indeed_jobs(last_page):

    jobs = []
    
    result = requests.get(f"{URL}&start={0*LIMIT}")
    soup = BeautifulSoup(result.text, "html.parser")
    results = soup.find_all("h2", {"class": "jobTitle"})
    jobs.append(results)

    return jobs

Answer 1

This happens because of the JavaScript in the page's source code. requests only downloads the raw HTML and does not run any JavaScript, so elements that scripts add afterwards are not in result.text. You can view the page source by pressing Ctrl+U in your browser.
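One way to confirm this from Python is to check whether the HTML you received contains the class names the scraper looks for. The following is only a minimal sketch using the standard library: `ClassCollector` and `missing_classes` are hypothetical helper names, and the expected class names (`pagination`, `jobTitle`) are taken from the code in the question. The sample HTML strings are made up for illustration.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every CSS class name that appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def missing_classes(html, expected=("pagination", "jobTitle")):
    """Return the expected class names that do not occur in the HTML."""
    collector = ClassCollector()
    collector.feed(html)
    return [name for name in expected if name not in collector.classes]


if __name__ == "__main__":
    # Made-up samples: one page with the elements, one without them.
    full_page = (
        '<div class="pagination"><a>1</a><a>2</a></div>'
        '<h2 class="jobTitle new">React Developer</h2>'
    )
    script_only_page = "<html><body><script>var x = 1;</script></body></html>"

    print(missing_classes(full_page))
    print(missing_classes(script_only_page))
```

In the scraper you could call `missing_classes(result.text)` right after `requests.get(URL)`. If it returns a non-empty list, the response did not contain those elements, which would explain why `soup.find(...)` gives `None` and `soup.find_all(...)` gives an empty list on that run.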

0 2021-05-14 18:44:59
