To improve my scraping skills, I have been trying to download the document at https://ikeacatalogues.ikea.com/sv-1950/page/1, but whenever I try to get a div, with or without an id, all I get is <div id="fakescroll"></div>.
What I want is the direct link to the document, which is in an anchor tag, but I am not able to access that either.
I tried to find all the links present in the webpage, and that returns an empty list.
Please help. Here is my code. It returns empty output:
from bs4 import BeautifulSoup
from selenium import webdriver
url = "https://ikeacatalogues.ikea.com/sv-1950/page/1"
browser = webdriver.Chrome(executable_path="/path/to/chromedriver.exe")
browser.get(url)
soup = BeautifulSoup(browser.page_source,"html.parser")
items = soup.select("div#main_menu")
print(items)
Here is my code for getting all the hrefs. The output is also empty:
import httplib2
from bs4 import BeautifulSoup, SoupStrainer
http = httplib2.Http()
status, response = http.request('https://ikeacatalogues.ikea.com/sv-1950/page/1')
for link in BeautifulSoup(response, parse_only=SoupStrainer('a')):
    if link.has_attr('href'):
        print(link['href'])
The images and texts are embedded in the page as data inside a <script> tag, so BeautifulSoup finds no matching elements for them. You can use the re and json modules to extract and decode that data. For example:
import re
import json
import requests
url = "https://ikeacatalogues.ikea.com/sv-1950/page/1"
text = requests.get(url).text
data = re.search(r"var data = (\{.*\});", text)
data = json.loads(data.group(1))
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
for s in data["spreads"]:
    for p in s["pages"]:
        print(p["text"])
        print("https://ikeacatalogues.ikea.com" + p["images"]["at2400"])
        print("#" * 80)
Prints:
...
Samma fåtölj med lös plymå
Nr 11 B Samma som ovanstående, men försedd med lös plymå, vilket är mycket popu
lärt och samtidigt lätt att rengöra. Plymån är av bästa resårkvalitet med avsydda kan
ter. Samma pris som nr 11.
Nr 11/1 Samma som nr 11, men u ta n • nackkudde och i något mindre utförande. Ty g
åtgång 1,6 meter.
Pris pr styck komplett med tyg
.................................................................................... 8 4 . 5 0
Pris pr styck utan tyg
..................................................................................................... 71.50
AB Tryckericentralen i lore*
https://ikeacatalogues.ikea.com/77436/1101577/pages/bbe136e8-1317-474e-ba07-c48a4ded045e-at2400.jpg
################################################################################
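If you want to try the extraction step without hitting the site, here is a minimal offline sketch of the same re / json approach. SAMPLE_HTML is a made-up, heavily shortened stand-in for the real page source, and the helper names extract_data and page_image_urls are my own, not part of any library. It also guards against the case where the `var data = ...` assignment is not found, which the snippet above would turn into an AttributeError.

```python
import json
import re

BASE_URL = "https://ikeacatalogues.ikea.com"

# Hypothetical, shortened stand-in for the real page source (assumption).
SAMPLE_HTML = """
<html><body><div id="fakescroll"></div>
<script>
var data = {"spreads": [{"pages": [{"text": "Nr 11", "images": {"at2400": "/1/2/pages/abc-at2400.jpg"}}]}]};
</script></body></html>
"""


def extract_data(html):
    """Return the object assigned to `var data` in the page, or None if absent."""
    match = re.search(r"var data = (\{.*\});", html)
    if match is None:
        return None
    return json.loads(match.group(1))


def page_image_urls(data, size="at2400"):
    """Collect absolute image URLs for every page in every spread."""
    urls = []
    for spread in data.get("spreads", []):
        for page in spread.get("pages", []):
            path = page.get("images", {}).get(size)
            if path:
                urls.append(BASE_URL + path)
    return urls


data = extract_data(SAMPLE_HTML)
urls = page_image_urls(data) if data else []
print(urls)
```

Note that the pattern relies on the whole object sitting on one line, as it does in the sample. If the real page ever spreads it over several lines, passing re.DOTALL to re.search would probably be needed.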