Beautiful Soup returning empty html

Question

This is my second question about Beautiful Soup (sorry, I'm a beginner).

I was trying to fetch data from this website:

https://www.ccna8.com/ccna4-v6-0-final-exam-full-100-2017/

My Code:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

url = 'https://www.ccna8.com/ccna4-v6-0-final-exam-full-100-2017/'

uClient = uReq(url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "lxml")

print(page_soup)

But for some reason it returns an empty string.

I've been searching similar threads, and apparently this can happen when a website loads its content through external APIs, but this website doesn't.

Answer 1

It seems that the response body is gzip-compressed (the Content-Encoding of the response is gzip), so you need to decompress it before you can parse the HTML.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import gzip

url = 'https://www.ccna8.com/ccna4-v6-0-final-exam-full-100-2017/'

uClient = uReq(url)
page_html = gzip.decompress(uClient.read())
uClient.close()
page_soup = soup(page_html, "lxml")
print(page_soup)
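Calling gzip.decompress unconditionally raises an error if the server ever sends an uncompressed body. A minimal sketch of a more defensive variant is below; the helper names decode_body and fetch_html are hypothetical, and sending an Accept-Encoding header is an assumption rather than something this site is known to require.

```python
import gzip
from urllib.request import Request, urlopen


def decode_body(raw, content_encoding):
    """Return the body bytes, decompressing only when the body is gzip."""
    # Trust the Content-Encoding header when the server provides it.
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    # Otherwise sniff the two-byte gzip magic number, in case the header is missing.
    if raw[:2] == b"\x1f\x8b":
        return gzip.decompress(raw)
    return raw


def fetch_html(url):
    """Fetch url and return the body bytes, decompressed if necessary."""
    # Assumption: explicitly advertising gzip support is acceptable to the server.
    request = Request(url, headers={"Accept-Encoding": "gzip"})
    with urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

The result of fetch_html(url) can then be passed to soup(page_html, "lxml") exactly as in the code above.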

Answer 2

Try using the requests module, which decompresses gzip-encoded responses automatically.

Example:

import requests
from bs4 import BeautifulSoup as soup

url = 'https://www.ccna8.com/ccna4-v6-0-final-exam-full-100-2017/'

uClient = requests.get(url)
page_soup = soup(uClient.text, "lxml")
print(page_soup)

2 answers

solution1
2 ACCEPTED 2018-03-30 15:29:36

solution2
1 2018-03-30 15:19:37
