
Beautiful Soup (Python): getting the value of an attribute

I have some messy soup that I've been trying to parse and I can't figure out how I would do it. On the page there are a bunch of <div> tags, and I can successfully traverse through them all to find the div that I want.

This div has a custom attribute called "data-series", the value of which appears to be a list of dictionaries containing lists. The value of the data-series attribute looks like this:

<div data-series=
'[{"label":"Series 1","data":[[0,0.01214697],[1,0.01139803],[2,0.0101848]],"color":"#27a9e3"},
{"label":"series 2","data":[[0,0.00745604375],[1,0.00885196875],[2,0.009824050833]],"color":"#ffb848"}]'....

It then continues with some other custom attributes. I'm looking to pull out one of the numbers within this nested mess.

The value I want to end up printing is 0.01139803. It is in the first dictionary of the list, under the "data" key. The value of the "data" key is itself a list of lists, and the number I want is the second element of its second sub-list ([1][1]).

How would I pull this number out using Beautiful Soup?

The string in data-series is JSON (JavaScript Object Notation) data. You can use json.loads() to parse this string into Python data structures, then work with the result as you would any list and dict:

>>> import json
>>> s = '[{"label":"Series 1","data":[[0,0.01214697],[1,0.01139803],[2,0.0101848]],"color":"#27a9e3"},{"label":"series 2","data":[[0,0.00745604375],[1,0.00885196875],[2,0.009824050833]],"color":"#ffb848"}]'
>>> d = json.loads(s)
>>> d[0]['data'][1][1]
0.01139803
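
The session above starts from a string that has already been copied out of the page. As a minimal sketch of the missing step, assuming the div can be located by the presence of its data-series attribute (the `html` string and the `class="chart"` attribute below are made up for illustration, and the question's other attributes are omitted), the attribute value can be read from the tag with dictionary-style access and passed straight to json.loads():

```python
import json

from bs4 import BeautifulSoup

# Hypothetical markup modeled on the snippet in the question; the real page
# carries further custom attributes that are not shown there.
html = """
<div class="chart" data-series='[{"label":"Series 1","data":[[0,0.01214697],[1,0.01139803],[2,0.0101848]],"color":"#27a9e3"},{"label":"series 2","data":[[0,0.00745604375],[1,0.00885196875],[2,0.009824050833]],"color":"#ffb848"}]'></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the first div that has a data-series attribute at all.
div = soup.find("div", attrs={"data-series": True})

# Tag attributes are read like dictionary keys; the value is a plain string.
series = json.loads(div["data-series"])

# First dictionary -> "data" list -> second pair -> second element.
value = series[0]["data"][1][1]
print(value)
```

If the attribute might be absent on some pages, div.get("data-series") returns None instead of raising KeyError, which makes the missing case easier to check before parsing.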

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.

 