
How to extract data from a dynamic website?

I am trying to get the name and address of each restaurant listed on this platform:

https://customers.dlivery.live/en/list

So far I have tried BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = 'https://customers.dlivery.live/en/list'

# Send a browser-like User-Agent with the request
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) '
           'AppleWebKit/537.36 (KHTML, like Gecko) '
           'Chrome/75.0.3770.80 Safari/537.36'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
print(soup.prettify())  # inspect the HTML that was actually downloaded

I noticed that soup does not contain any data about the restaurants. How can I extract it?

If you inspect the page's elements, you will notice that the names are wrapped in the card_heading class and the addresses in the card_distance class.

soup = BeautifulSoup(response.text, 'html.parser')
# Collect every element that carries the card_distance class
restaurant_addresses = soup.find_all(class_='card_distance')

for address in restaurant_addresses:
    print(address.text)

and

soup = BeautifulSoup(response.text, 'html.parser')
# Collect every element that carries the card_heading class
restaurant_names = soup.find_all(class_='card_heading')

for name in restaurant_names:
    print(name.text)

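I'm not sure this exact code will work, but it is pretty close to what you are looking for. Note that find_all can only match elements that are present in the HTML you parse; if the page fills in the cards with JavaScript after loading, response.text from requests will not contain them.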
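
The two loops above print names and addresses separately. As a rough sketch, they could be combined so that each name is paired with its address. The SAMPLE_HTML string and the extract_restaurants helper below are hypothetical; the markup is only a guess at what the rendered cards might look like, and the pairing assumes each card has exactly one card_heading and one card_distance, in the same order.

```python
from bs4 import BeautifulSoup

# Hypothetical sample of the rendered markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="card">
  <h3 class="card_heading">Pizzeria Roma</h3>
  <p class="card_distance">12 Main Street</p>
</div>
<div class="card">
  <h3 class="card_heading">Sushi Bar</h3>
  <p class="card_distance">34 High Street</p>
</div>
"""


def extract_restaurants(html):
    """Return a list of (name, address) tuples found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    names = soup.find_all(class_="card_heading")
    addresses = soup.find_all(class_="card_distance")
    # zip() pairs the n-th name with the n-th address; this assumes every
    # card has exactly one of each, in the same order.
    return [
        (name.get_text(strip=True), address.get_text(strip=True))
        for name, address in zip(names, addresses)
    ]


if __name__ == "__main__":
    for name, address in extract_restaurants(SAMPLE_HTML):
        print(f"{name}: {address}")
```

The helper takes an HTML string rather than a URL, so it can be fed whatever HTML actually contains the cards, whether that is response.text or the page source obtained some other way.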


 