I'm trying to grab the name and price of each product on the website https://store.com/shop.
When I view the site in a browser I can see the HTML for each product, but when I parse the page with BeautifulSoup in Python the products are missing.
I think the problem is that the site renders the products in some sort of widget, so they do not appear in the page source, but I am not sure.
import requests
from bs4 import BeautifulSoup

my_url = 'https://store.com/shop'
headers = {"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15"}
# Open the connection and grab the page
source = requests.get(my_url, headers=headers)
html = source.content
soup = BeautifulSoup(html, 'lxml')
print(soup.prettify())
The products are loaded dynamically by sending a GET request to:
https://roeblingliquors.com/api/v1/products/search.json?additional_properties%5Btype%5D%5B%5D=Spirits&new_style=true&merchant_id=5b19b7150fb4f72d6831344b&limit=20&skip=0&api_key=e0d3a091dc0d81547d6e168be2b3492a&sdk_guid=32560d92-28f6-e067-7286-3f505a73e61a&client_origin=app%3A%2F%2Fstorefront.5b19b7150fb4f72d6831344b
The response is in JSON format.
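To see which parameters that request carries, the query string can be decoded with the standard library. This is only a sketch: the variable names are illustrative, and reading `limit` as the page size and `skip` as the offset is an assumption based on the parameter names, not something confirmed by the API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# The request URL quoted above, split across lines for readability.
api_url = (
    "https://roeblingliquors.com/api/v1/products/search.json"
    "?additional_properties%5Btype%5D%5B%5D=Spirits&new_style=true"
    "&merchant_id=5b19b7150fb4f72d6831344b&limit=20&skip=0"
    "&api_key=e0d3a091dc0d81547d6e168be2b3492a"
    "&sdk_guid=32560d92-28f6-e067-7286-3f505a73e61a"
    "&client_origin=app%3A%2F%2Fstorefront.5b19b7150fb4f72d6831344b"
)

parts = urlsplit(api_url)
params = parse_qs(parts.query)  # percent-decoded; every value is a list

for key, values in params.items():
    print(key, "=", values)

# Rebuild the URL with a different offset (assumed meaning of "skip").
params["skip"] = ["20"]
next_url = urlunsplit(parts._replace(query=urlencode(params, doseq=True)))
print(next_url)
```

No network access is needed for this step; it only takes the URL apart and puts it back together.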
You can extract the products with just requests; there's no need to use BeautifulSoup.
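The following fetches the products returned by that request. With limit=20 and skip=0 in the URL, that is presumably the first 20 products rather than every page.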
import requests

url = "https://roeblingliquors.com/api/v1/products/search.json?additional_properties%5Btype%5D%5B%5D=Spirits&new_style=true&merchant_id=5b19b7150fb4f72d6831344b&limit=20&skip=0&api_key=e0d3a091dc0d81547d6e168be2b3492a&sdk_guid=32560d92-28f6-e067-7286-3f505a73e61a&client_origin=app%3A%2F%2Fstorefront.5b19b7150fb4f72d6831344b"
response = requests.get(url).json()

# Left-align each column to the specified width
fmt_string = "{:<70} {:<15} {:<10}"
print(fmt_string.format("Name", "Measure", "Price"))
print("-" * 100)

for data in response["data"]["products"]:
    # Each product can be sold in several sizes; print one row per option
    for product in data["merchants"][0]["product_options"]:
        measure = (
            product["option_params"]["size"]["measure"]
            + " "
            + product["option_params"]["size"]["quantity"]
        )
        price = product["price"]
        print(fmt_string.format(data["name"], measure, price))
Output (truncated):
Name                                                                   Measure         Price
----------------------------------------------------------------------------------------------------
Plantation Rum Extra Old 20th Ann                                      ml 750          68.2
Elijah Craig Barrel Proof Bourbon A121                                 ml 750          109.99
Elijah Craig Small Batch Kentucky Straight Bourbon Whiskey 94 Proof    ml 750          37.38
Tito's Vodka                                                           ml 50           2.53
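The request URL carries limit=20 and skip=0, so a single call most likely returns one page. One possible way to walk through further pages is to increase skip until nothing more comes back. This is a sketch under assumptions: `fetch_all_products`, `get_json`, and `max_pages` are hypothetical names, and it assumes the API answers with an empty products list once skip runs past the last product, which I have not verified.

```python
def get_json(url, query):
    """Default fetcher: one GET request, decoded as JSON."""
    import requests  # third-party; imported here so the paging loop can be tested without it
    return requests.get(url, params=query, timeout=10).json()


def fetch_all_products(base_url, params, page_size=20, fetch=get_json, max_pages=100):
    """Collect products page by page until an empty page comes back.

    Assumes "limit" is the page size and "skip" is the offset, and that
    the API returns an empty list past the end (unverified assumption).
    """
    products = []
    skip = 0
    for _ in range(max_pages):  # guard against looping forever
        query = dict(params, limit=page_size, skip=skip)
        page = fetch(base_url, query)["data"]["products"]
        if not page:
            break
        products.extend(page)
        skip += page_size
    return products


# Example call (performs real requests, so it is left commented out):
# params = {
#     "additional_properties[type][]": "Spirits",
#     "new_style": "true",
#     "merchant_id": "5b19b7150fb4f72d6831344b",
#     "api_key": "e0d3a091dc0d81547d6e168be2b3492a",
#     "sdk_guid": "32560d92-28f6-e067-7286-3f505a73e61a",
#     "client_origin": "app://storefront.5b19b7150fb4f72d6831344b",
# }
# products = fetch_all_products(
#     "https://roeblingliquors.com/api/v1/products/search.json", params
# )
```

Passing the fetcher in as an argument keeps the paging logic separate from the network call, so it can be exercised with canned responses before pointing it at the live site.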