
VS Code issue: no code output. How do I fix this?

I am very new to coding and am trying to write a practice web-scraping script in the VS Code editor. Every time I run the script, I get no real output. Can you please advise on what the issue is? Note: the pink boxes in the screenshot are just covering my name.

I ran the code and expected scraped data from the link. I have tried many different scripts and the same thing happens, so I think something must be wrong with my whole setup.

VS Code is an excellent IDE, but when you start a new project (or open a folder in VS Code), it does not come with any build tools, compilers, or interpreters. You have to install and configure the toolchain for each language yourself. Here are some instructions for Python.
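A quick way to confirm the Python side of that setup is to run a small script that prints which interpreter VS Code is using and whether the scraping packages can be imported. This is only a rough sketch: the helper name `check_environment` and the package list are my own choices for illustration, not something from the question.

```python
import importlib.util
import sys


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # If nothing at all is printed, the script is not being run by Python.
    print("Interpreter:", sys.executable)
    print("Version:", sys.version.split()[0])
    # Assumed package list: the import names for requests, Beautiful Soup and Selenium.
    for name, found in check_environment(["requests", "bs4", "selenium"]).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If a package shows as missing, it was probably installed into a different interpreter than the one selected in VS Code.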

This is not a problem with VSCode but I am going to answer your question.

You can't scrape indeed.com with requests and Beautiful Soup because the site uses Cloudflare bot protection. If you take a closer look at the response, you'll see it returns a 403 Forbidden status code instead of 200 OK. You can scrape it with a headless browser driven by Selenium instead.
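To see the status code for yourself, you can check it before trying to parse anything. Here is a minimal sketch using only the standard library (`urllib`) so it runs without extra installs; with requests, the same information is in `response.status_code`. The function names `explain_status` and `fetch_status` are hypothetical helpers I made up for this example.

```python
import urllib.error
import urllib.request


def explain_status(status_code):
    """Turn an HTTP status code into a short human-readable diagnosis."""
    if status_code == 200:
        return "200 OK: the page was returned"
    if status_code == 403:
        return "403 Forbidden: the server refused the request (often bot protection)"
    return f"{status_code}: unexpected status, inspect the response"


def fetch_status(url):
    """Return the HTTP status code for url, including error codes such as 403."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is kept on the exception.
        return error.code
```

You would call it as `print(explain_status(fetch_status(url)))` with the URL you are scraping. If that prints the 403 line, the empty output comes from the site blocking the request, not from VS Code.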

Here's an example.

First, install selenium and webdriver_manager:

pip install selenium webdriver_manager

Then run the following script:

from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Turn off the flags that mark the browser as automated; some sites refuse access otherwise
options = ChromeOptions()
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)

driver = Chrome(options=options, service=Service(
    ChromeDriverManager().install()))

# Make sure you are not detected as HeadlessChrome, some sites will refuse access
ua = driver.execute_script("return navigator.userAgent").replace(
    "HeadlessChrome", "Chrome")
driver.execute_cdp_cmd("Network.setUserAgentOverride", {
                       "userAgent": ua})
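# Hide navigator.webdriver on every new page; a plain execute_script here
# would only affect the current blank page and be lost on navigation
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": "Object.defineProperty(navigator,'webdriver',{get:()=>undefined});"})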


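driver.get("https://www.indeed.com/companies/best-Agriculture-companies")
main = driver.find_element(By.ID, "main")

# Print the scraped text so the script produces visible output, then close the browser
print(main.text)
driver.quit()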
