I am new to Python and have a simple question about parsing HTML. I have used Beautiful Soup to get to this point. I want to extract the taxes and maintenance values from the element below, but I am not sure how to do this.
<div class="estimated_payment clickable overlay_trigger hidden-xs"
id="overlay_trigger_1255749" se:behavior="monthly_payment" se:monthly_payment:attributes='{"id":1255749,"taxes":3682.0,"price":5500000,"maintenance":1875.0,"mortgage_rate":3.5,"mortgage_term":30,"down_payment_amount":1100000.0,"down_payment_rate":20.0,"min_down_payment_rate":20.0,"min_down_payment_amount":1100000.0}'> Est. Payment:
You need to do it in two steps:
1. Locate the element and read its se:monthly_payment:attributes attribute value.
2. Parse that attribute value with json.loads() into a Python dictionary and get the desired amounts by key.
Implementation:
import json
from bs4 import BeautifulSoup
data = """
<div class="estimated_payment clickable overlay_trigger hidden-xs"
id="overlay_trigger_1255749"
se:behavior="monthly_payment"
se:monthly_payment:attributes='{"id":1255749,"taxes":3682.0,"price":5500000,"maintenance":1875.0,"mortgage_rate":3.5,"mortgage_term":30,"down_payment_amount":1100000.0,"down_payment_rate":20.0,"min_down_payment_rate":20.0,"min_down_payment_amount":1100000.0}'>
Est. Payment: $0
</div>
"""
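# Parse the markup and locate the element by its CSS class
soup = BeautifulSoup(data, "html.parser")
# Read the raw JSON string stored in the custom attribute
attr_value = soup.select_one(".estimated_payment")["se:monthly_payment:attributes"]
# Decode the JSON string into a Python dictionary
payment_data = json.loads(attr_value)
# Look up the desired amounts by key
print(payment_data["taxes"])
print(payment_data["maintenance"])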
Prints:
3682.0
1875.0
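If the page contains several listings, or some elements lack the attribute or a key, the same two steps can be wrapped in a small helper. The following is only a sketch under assumptions: the function name extract_payments, the ATTR constant, and the sample markup are made up for illustration and are not part of the original page. It selects every tag that carries the attribute, skips values that are not valid JSON, and uses dict.get() so a missing key yields None instead of raising KeyError.

```python
import json
from bs4 import BeautifulSoup

# Attribute name taken from the markup in the question
ATTR = "se:monthly_payment:attributes"


def extract_payments(html):
    """Return a list of dicts with id, taxes and maintenance for each tag carrying ATTR.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # attrs={name: True} matches any tag that has the attribute, whatever its value
    for tag in soup.find_all(attrs={ATTR: True}):
        try:
            data = json.loads(tag[ATTR])
        except json.JSONDecodeError:
            # Skip elements whose attribute value is not valid JSON
            continue
        results.append({
            "id": data.get("id"),
            "taxes": data.get("taxes"),              # None if the key is absent
            "maintenance": data.get("maintenance"),  # None if the key is absent
        })
    return results


# Made-up sample markup: two listings with data, one without the attribute
sample = """
<div class="estimated_payment" se:monthly_payment:attributes='{"id":1,"taxes":3682.0,"maintenance":1875.0}'>Est. Payment</div>
<div class="estimated_payment" se:monthly_payment:attributes='{"id":2,"taxes":100.0}'>Est. Payment</div>
<div class="estimated_payment">No data</div>
"""

payments = extract_payments(sample)
for payment in payments:
    print(payment)
```

Selecting by the attribute rather than by the CSS class is a design choice here: it avoids a TypeError or KeyError when an element with the class exists but does not carry the JSON attribute.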