
Convert String To List of Dictionaries Python3

Using Python 3.5.2, what is the best way to convert a string into a list of dictionaries?

I'm scraping a site, and the following is returned as a list of length 1:

(Formatted for readability)

[
{"variation_id":573,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":1099,"display_regular_price":1099,"attributes":{"attribute_pa_size":"king"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>1,099.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211693","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""},

{"variation_id":574,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":989,"display_regular_price":989,"attributes":{"attribute_pa_size":"queen"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>989.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211686","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""},

{"variation_id":575,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":949,"display_regular_price":949,"attributes":{"attribute_pa_size":"double"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>949.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211679","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""}

]

I tried converting that to a str, assigning it to 's', and then calling json.loads(s), but that didn't work.

I'd like to end up with a list whose values I can access with something like:

for item in form_data_returned:
    print(item['variation_id'])  # prints 573  574  575

Thanks
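One possible reason the attempt in the question failed, sketched under an assumption: if the scraped result is a list whose single element is already the JSON text, then str() on the whole list produces Python's repr of that list, which is not valid JSON. Parsing the element itself avoids that. The name scraped and the shortened data are hypothetical stand-ins, not the asker's actual variables.

```python
import json

# Hypothetical stand-in for the scraped result: a list of length 1
# whose only element is the JSON text (shortened to two fields).
scraped = ['[{"variation_id": 573, "is_in_stock": true}, '
           '{"variation_id": 574, "is_in_stock": true}]']

# str() on the list yields its repr, wrapped in [' ... '], which is not JSON.
try:
    json.loads(str(scraped))
    str_attempt_parsed = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    str_attempt_parsed = False

# Parsing the element itself gives a list of dictionaries.
form_data_returned = json.loads(scraped[0])

for item in form_data_returned:
    print(item['variation_id'])
```

json.loads also maps the JSON literals true and false to Python's True and False, so no manual conversion is needed.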

from collections import defaultdict

# Set aliases for `true` and `false` in the output so
# we won't get NameError exceptions thrown.
true = True
false = False

raw = [
{"variation_id":573,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":1099,"display_regular_price":1099,"attributes":{"attribute_pa_size":"king"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>1,099.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211693","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""},

{"variation_id":574,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":989,"display_regular_price":989,"attributes":{"attribute_pa_size":"queen"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>989.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211686","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""},

{"variation_id":575,"variation_is_visible":true,"variation_is_active":true,"is_purchasable":true,"display_price":949,"display_regular_price":949,"attributes":{"attribute_pa_size":"double"},"image_src":"","image_link":"","image_title":"","image_alt":"","image_caption":"","image_srcset":"","image_sizes":"","price_html":"<span class=\"price\"><span class=\"woocommerce-Price-amount amount\"><span class=\"woocommerce-Price-currencySymbol\">&#82;<\/span>949.00<\/span><\/span>","availability_html":"<p class=\"stock in-stock\">2 in stock<\/p>","sku":"6006239211679","weight":" kg","dimensions":"","min_qty":1,"max_qty":2,"backorders_allowed":false,"is_in_stock":true,"is_downloadable":false,"is_virtual":false,"is_sold_individually":"no","variation_description":""}

]

# keys being a set ensures that every key occurs only once.
keys = set()

# Initializing form_data_returned as a defaultdict allows
# us to access keys that are not already in form_data_returned.
# For example form_data_returned['weight'].append('kg') would throw
# KeyError exception for an empty form_data_returned had we declared
# it as a normal dict().
form_data_returned = defaultdict(list)

for dictionary in raw:
    keys.update(dictionary.keys())
    for key in keys:
        # .get() avoids a KeyError when a dictionary lacks a key seen earlier.
        form_data_returned[key].append(dictionary.get(key))

We can now retrieve data by key:

print(form_data_returned['variation_id'])
# [573, 574, 575]
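The snippet above pastes the data in as a Python literal, which is why it needs the true and false aliases. If the data is still a JSON string, as the question describes, a minimal sketch could parse it with json.loads first and then group the values by key in the same way. The names raw_text, records and by_key are hypothetical, and the data is shortened.

```python
import json
from collections import defaultdict

# Hypothetical JSON text, shortened from the question's data.
raw_text = ('[{"variation_id": 573, "is_virtual": false}, '
            '{"variation_id": 574, "is_virtual": false}, '
            '{"variation_id": 575, "is_virtual": false}]')

# json.loads converts true/false itself, so no aliases are required.
records = json.loads(raw_text)

# Collect the values of each key across all records.
by_key = defaultdict(list)
for record in records:
    for key, value in record.items():
        by_key[key].append(value)

print(by_key['variation_id'])
```

Iterating over record.items() only touches keys that are present in each record, so a record with a missing key does not raise KeyError.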

Use the re module to preprocess the string, then use the json module to parse it into a list of dictionaries.

Assuming you have the data converted to a string, and you know that certain rules apply to the content*, you can try the following:

import json
import re

s = '...'  # the scraped data as a string

escaped = re.sub('(?<=[^,:{}])(\\\")(?=[^,:{}])', '\\"', s)

data = json.loads(escaped)

The regular expression uses the lookbehind (?<=[^,:{}]) and the lookahead (?=[^,:{}]) to find every " character that is neither preceded nor followed by ',', ':', '{' or '}', so that the " characters inside the string values of the data can be escaped properly.

*By rules I mean that you have to know the regular expression finds the correct characters. If the data source can provide that consistency, the code above should work (extend the (?<=[^,:{}]) and (?=[^,:{}]) parts with whatever characters are needed to match all of the data).
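To illustrate how the lookbehind and lookahead select only the quotes inside a value, here is a small sketch. It uses a simplified variant of the pattern that matches a bare " character, which is an assumption rather than the exact expression above, and the input string broken is hypothetical.

```python
import json
import re

# Hypothetical input: the quotes around hi are not escaped,
# so this text is not valid JSON as it stands.
broken = '{"note":"say "hi" now","qty":2}'

# Match a " only when the characters on both sides are not one of , : { }
# (simplified variant of the pattern discussed above).
pattern = r'(?<=[^,:{}])"(?=[^,:{}])'

# In the replacement template, \\ produces one literal backslash.
escaped = re.sub(pattern, r'\\"', broken)

data = json.loads(escaped)
print(data['note'])
```

Quotes that sit next to a structural character, such as the ones delimiting keys and values, are left alone. A " that follows '[' would be matched, which is the kind of case where the character classes have to be extended.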
