
Parsing HTML using Beautiful Soup

I am new to Python and have a simple question about parsing HTML. I have used Beautiful Soup to get this far. I want to extract the taxes and maintenance values from the element below, but I am not sure how to do this.

<div class="estimated_payment clickable overlay_trigger hidden-xs"
id="overlay_trigger_1255749" se:behavior="monthly_payment" se:monthly_payment:attributes='{"id":1255749,"taxes":3682.0,"price":5500000,"maintenance":1875.0,"mortgage_rate":3.5,"mortgage_term":30,"down_payment_amount":1100000.0,"down_payment_rate":20.0,"min_down_payment_rate":20.0,"min_down_payment_amount":1100000.0}'> Est. Payment: $0

You need to do it in two steps:

  • locate the element and extract the value of its se:monthly_payment:attributes attribute
  • load that value into a Python dictionary via json.loads() and get the desired amounts by key

Implementation:

import json

from bs4 import BeautifulSoup


data = """
<div class="estimated_payment clickable overlay_trigger hidden-xs"
     id="overlay_trigger_1255749"
     se:behavior="monthly_payment"
     se:monthly_payment:attributes='{"id":1255749,"taxes":3682.0,"price":5500000,"maintenance":1875.0,"mortgage_rate":3.5,"mortgage_term":30,"down_payment_amount":1100000.0,"down_payment_rate":20.0,"min_down_payment_rate":20.0,"min_down_payment_amount":1100000.0}'>
     Est. Payment: $0
</div>
"""
soup = BeautifulSoup(data, "html.parser")

attr_value = soup.select_one(".estimated_payment")["se:monthly_payment:attributes"]
payment_data = json.loads(attr_value)

print(payment_data["taxes"])
print(payment_data["maintenance"])

Prints:

3682.0
1875.0
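
If a page contains several such elements, or some of them lack the attribute or carry malformed JSON, the same two steps can be wrapped in a more defensive helper. The following is only a sketch under assumptions: the function name extract_payments, the ATTR constant, and the shortened sample markup are hypothetical and not part of the original question. It matches on the attribute name rather than the CSS class and skips values that do not parse.

```python
import json

from bs4 import BeautifulSoup

# Attribute name taken from the question's markup.
ATTR = "se:monthly_payment:attributes"


def extract_payments(html):
    """Return a list of (taxes, maintenance) tuples, one per usable element."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # attrs={ATTR: True} matches any tag that has the attribute, whatever its value.
    for tag in soup.find_all(attrs={ATTR: True}):
        try:
            payload = json.loads(tag[ATTR])
        except json.JSONDecodeError:
            continue  # attribute present, but its value is not valid JSON
        if not isinstance(payload, dict):
            continue  # valid JSON, but not an object with keys
        # .get() yields None instead of raising when a key is absent.
        results.append((payload.get("taxes"), payload.get("maintenance")))
    return results


# Hypothetical, shortened sample: one usable element, one malformed, one without the attribute.
sample = """
<div class="estimated_payment" se:monthly_payment:attributes='{"taxes":3682.0,"maintenance":1875.0}'>Est. Payment: $0</div>
<div class="estimated_payment" se:monthly_payment:attributes='not json'>Est. Payment: $0</div>
<div class="other">no data</div>
"""

print(extract_payments(sample))
```

Only the first element in the sample yields a tuple. Returning a list keeps the caller's code the same whether the page has zero, one, or many listings.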
