Scraping linked pages with BeautifulSoup

Question

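I have a page of job listings, and I want to follow each href link and get the job description from the linked page. I have written some code (the relevant line is marked with a comment), but the output contains scripts. I only want the job description from the linked page.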

url = f"https://weworkremotely.com/remote-jobs/search?term={keyword}"
html = requests.get(url, headers=headers).text
soup = BeautifulSoup(html, 'lxml')
section = soup.find_all("section", {"class": "jobs"})
for item in section:
    a = item.select("li > a")
    for item2 in a:
        if str(item2.parent['class']) != "['view-all']":
            link = f"https://weworkremotely.com{item2.get('href')}"
            htmll = requests.get(link, headers=headers).text  # this is how I'm trying to get the description from the linked page
            soupp = BeautifulSoup(htmll, 'lxml')
            print(soupp.prettify)
            apply_list.append(link)
        else:
            continue

return company_list, title_list, apply_list

Answer 1

Have you tried using get_text() instead? Alternatively, if the site you want to scrape relies heavily on JavaScript, consider switching to Selenium.

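# get_text() is a BeautifulSoup method, not a requests.Response method, so parse the HTML first
text = BeautifulSoup(requests.get(link, headers=headers).text, 'lxml').get_text()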
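
As a rough sketch of the get_text() idea, the description can be pulled out of the linked page by selecting one element and dropping any script or style tags first. The helper name `extract_description` and the selector `div.job-description` are hypothetical; the real selector has to be found by inspecting the job page. The built-in `html.parser` is used here only to keep the example free of extra dependencies; `lxml` should work the same way.

```python
from bs4 import BeautifulSoup


def extract_description(html, selector="div.job-description"):
    """Return the visible text of the first element matching `selector`.

    `selector` is a hypothetical placeholder; inspect the real job page
    to find the element that wraps the description.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Drop <script> and <style> elements so their contents never reach the output.
    for tag in soup(["script", "style"]):
        tag.decompose()
    node = soup.select_one(selector)
    if node is None:
        # No matching element: let the caller decide how to handle it.
        return None
    # Join text fragments with single spaces and trim surrounding whitespace.
    return node.get_text(separator=" ", strip=True)


# Small inline page standing in for a fetched job page.
sample = """
<html><head><script>var tracking = 1;</script></head>
<body><div class="job-description"><p>Build APIs.</p><p>Remote only.</p></div></body></html>
"""
print(extract_description(sample))
```

Inside the loop from the question, this could presumably be called as `extract_description(requests.get(link, headers=headers).text, selector)` once the right selector is known.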

Solution 1
0 2020-11-27 10:18:46