
Trying to web crawl data from a website using Python and BS4

I am trying to import data from the URL given in the code below. When I run the code, I do not get any of the information I need (such as the plan name and rates); the output contains the container div tags but not their contents. I also tried response.text, but it gave me no results. I do not want to use Selenium. Is there a way to solve this?

from urllib.request import urlopen

from bs4 import BeautifulSoup

URL = "https://www.energymadeeasy.gov.au/plan?id=POW15475MBE3&postcode=2000"
response = urlopen(URL)
# Name the parser explicitly so BeautifulSoup does not have to guess one
html_content = BeautifulSoup(response, "html.parser")
print(html_content)

or

import requests

soup = BeautifulSoup(requests.get(URL).text, "lxml")
print(soup)

I tried to extract the header using the following:

h1=html_content.find("div", {"class":"header-left"})
print(h1)

The website makes AJAX calls in the background to load the data, so the plan details are not part of the HTML returned by urlopen or requests.get.

Two XHR calls are made to load the data. You are probably looking for one of them.

import json
import requests

# Request the API endpoint that the page itself calls via XHR
res = requests.get("https://api.energymadeeasy.gov.au/plans/dpids/POW15475MBE3")
with open("data.json", "w") as f:
    json.dump(res.json(), f)

This saves the JSON to a file.
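The plan ID at the end of the endpoint above (POW15475MBE3) matches the id query parameter of the page URL in the question. Assuming that pattern holds for other plans (which this answer has not verified), a minimal sketch for deriving the API URL from a page URL could look like this. The helper name plan_api_url is hypothetical, not part of any library.

```python
from urllib.parse import urlparse, parse_qs

# Endpoint prefix taken from the request shown above.
API_BASE = "https://api.energymadeeasy.gov.au/plans/dpids/"


def plan_api_url(page_url):
    """Hypothetical helper: map a plan page URL to its API URL.

    Assumes the page's 'id' query parameter is the plan ID the API expects.
    """
    query = parse_qs(urlparse(page_url).query)
    plan_ids = query.get("id")
    if not plan_ids:
        raise ValueError("page URL has no 'id' query parameter")
    return API_BASE + plan_ids[0]


page_url = "https://www.energymadeeasy.gov.au/plan?id=POW15475MBE3&postcode=2000"
print(plan_api_url(page_url))
```

The resulting string can be passed to requests.get in place of the hard-coded URL.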

Sample data in the file:

[{"planData": {"planType": "M", "tariffType": "TOU", "contract": [{"pricingModel": "TOU", "benefitPeriod": "1 year", "coolingOffDays": 10, "solarFit": [{"type": "R", "description": "Powerdirect Retailer Feed-in Tariff (exc. GST if any)", "rate": 9.5}], "additionalFeeInformation": "Additional fees and charges may apply. Please see the Powerdirect fee schedules at powerdirect.com.au/fees", "fee": [{"description": "Fee may be charged when reconnecting or reading your meter when you move into a property or change retailer. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "ConnF", "feeTerm": "F"}, {"description": "Fee may be charged when reconnecting in other circumstances, such as after disconnection for non-payment. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "RecoF", "feeTerm": "F"}, {"description": "Fee may be charged when disconnecting or reading your meter when you move out of a property or change retailer. Includes GST. Fees may vary.", "amount": 12.55, "feeType": "DiscoFMO", "feeTerm": "F"}, {"description": 
...
...
...
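Once the JSON is available, the values can be read with ordinary dict and list access instead of HTML parsing. The sketch below assumes the response keeps the shape visible in the excerpt above (a list of objects, each with planData, then contract, then fee). The helper list_fees and the abbreviated sample are illustrative only; the sample is a trimmed copy of the excerpt, not the full response.

```python
def list_fees(plans):
    """Collect (feeType, amount) pairs from every contract of every plan.

    Uses .get() with defaults so a missing key yields no rows
    instead of raising KeyError.
    """
    rows = []
    for plan in plans:
        for contract in plan.get("planData", {}).get("contract", []):
            for fee in contract.get("fee", []):
                rows.append((fee.get("feeType"), fee.get("amount")))
    return rows


# Abbreviated from the sample data above (descriptions and other keys omitted).
sample = [
    {
        "planData": {
            "planType": "M",
            "tariffType": "TOU",
            "contract": [
                {
                    "pricingModel": "TOU",
                    "fee": [
                        {"amount": 12.55, "feeType": "ConnF", "feeTerm": "F"},
                        {"amount": 12.55, "feeType": "RecoF", "feeTerm": "F"},
                    ],
                }
            ],
        }
    }
]

for fee_type, amount in list_fees(sample):
    print(fee_type, amount)
```

With the real response, json.load on data.json (or res.json() directly) would take the place of the sample list.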
