I am trying to get the HTML code from a web page, but only about a quarter of the page shows up.
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.hltv.org/matches")
print(driver.page_source)
It feels like I have tried everything but still get the same result. It doesn't start at the top. It starts far far down, almost at the end.
Anyone got a clue?
Try the code below; this worked for me.
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.hltv.org/matches")
# "w" overwrites the file, so repeated runs don't append duplicate copies
with open("asd.html", "w", encoding="utf8") as file:
    file.write(driver.page_source)
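One possible reason writing to a file helps (an assumption about your setup, not something confirmed in the question) is that a terminal only keeps a limited amount of scrollback, so a very long print can look like it starts partway down the page. A minimal sketch to check this: save the source and print only its length and its first and last characters. The helper name `save_page_source` is hypothetical, not part of Selenium.

```python
from pathlib import Path


def save_page_source(html, path="asd.html"):
    """Write html to path (overwriting it) and return a short summary.

    Hypothetical helper for illustration; pass it driver.page_source.
    """
    Path(path).write_text(html, encoding="utf8")
    return {
        "chars": len(html),    # total size of the source
        "head": html[:200],    # should begin near the top of the document
        "tail": html[-200:],   # should end near the closing tags
    }


# Usage with Selenium (assumes `driver` has already loaded the page):
# summary = save_page_source(driver.page_source)
# print(summary["chars"], summary["head"], summary["tail"], sep="\n")
```

If the head shows the top of the document and the tail shows the end, the full source was retrieved and only the console output was cut off.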
It could be that the page has not finished loading by the time you print its source.
To fix this you could try waiting for a known element to load before printing.
To wait for an element ("backToLoginDialog" in the example below) to load, adjust your code to be like the following:
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
# set up the driver and the maximum time (in seconds) to wait for the element
driver = webdriver.Chrome()
timeout = 5
# create your "wait" function
def wait_for_load(element_id):
    element_present = EC.presence_of_element_located((By.ID, element_id))
    WebDriverWait(driver, timeout).until(element_present)
driver.get('https://www.hltv.org/matches')
wait_for_load('backToLoginDialog')
print(driver.page_source)
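In case it helps to see what an explicit wait does conceptually: it keeps re-checking a condition until it holds or a timeout expires. The following is a simplified standalone sketch of that polling idea, not Selenium's actual implementation; `wait_until` and its parameters are hypothetical names chosen for illustration.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success. Illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Example: a condition that only succeeds on the third check.
attempts = []

def ready():
    attempts.append(1)
    return len(attempts) >= 3

wait_until(ready, timeout=2, poll_interval=0.01)
```

If the element never appears, a wait like this ends with a timeout error instead of printing a partial page, so it is worth choosing an element that you know exists on the page.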