
How to convert Unicode dict to dict

I am trying to convert:

datalist = [u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg'}"]

into a list of Python dicts. If I try to extract a value by key, I get this error:

for i in datalist:
    print i['smallimage']

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-686ea4feba66> in <module>()
      1 for i in datalist:
----> 2     print i['smallimage']
      3 

TypeError: string indices must be integers

How do I convert a list containing Unicode dicts to dicts?

You could use the demjson module, which has a non-strict mode that handles the data you have:

import demjson

for data in datalist:
    dct = demjson.decode(data)
    print dct['gallery'] # etc...

In this case, I'd hand-craft a regular expression to turn these strings into something you can evaluate as Python:

import re
import ast
from functools import partial

keys = re.compile(r'(gallery|smallimage|largeimage)')
fix_keys = partial(keys.sub, r'"\1"')

for entry in datalist:
    entry = ast.literal_eval(fix_keys(entry))

Yes, this is limited, but it works for this data set and is robust as long as the key names match. The regular expression is simple to maintain. Moreover, it needs no external dependencies; it relies only on the batteries already included.

Result:

>>> for entry in datalist:
...     print ast.literal_eval(fix_keys(entry))
... 
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg'}
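
The hard-coded key list can be relaxed. Here is a minimal sketch, assuming every key is a bare identifier that directly follows `{` or `,`, and that no string value contains text of that shape. The names `KEY_PATTERN` and `parse_with_regex` are illustrative and not part of the answer above, and the sample URLs are placeholders. It is written for Python 3:

```python
import ast
import re

# Matches an unquoted identifier key that directly follows "{" or ",".
# Assumption: no string value contains text shaped like ", name:".
KEY_PATTERN = re.compile(r'([{,]\s*)([A-Za-z_]\w*)\s*:')

def parse_with_regex(text):
    """Quote the bare keys, then evaluate the result as a Python literal."""
    quoted = KEY_PATTERN.sub(r'\1"\2":', text)
    return ast.literal_eval(quoted)

# Placeholder data in the same shape as the question's strings.
sample = "{gallery: 'gal1', smallimage: 'http://example.com/s.jpg',largeimage: 'http://example.com/l.jpg'}"
record = parse_with_regex(sample)
print(record['smallimage'])
```

Because the pattern anchors on `{` or `,`, it avoids touching the `http:` inside the URLs, but it would still misfire on a value that happens to contain something like `, foo:`.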

As another thought, each string in your list is valid YAML.

> import yaml
> yaml.safe_load(u'{foo: "bar"}')['foo']
'bar'

And if you want to be really fancy and parse everything at once:

> data = yaml.safe_load('['+','.join(datalist)+']')
> data[0]['smallimage']
'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'
> data[3]['gallery']
'gal1'

If your dictionary keys were quoted (and the strings used double quotes, as JSON requires), you could use json.loads to load the string.

import json
for i in datalist:
   print json.loads(i)['smallimage']

(ast.literal_eval would have worked too.)

However, as it is, this will work with an old-school eval:

>>> class Mdict(dict):
...     def __missing__(self,key):
...        return key
... 
>>> eval(datalist[0],Mdict(__builtins__=None))
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}

Note that this is probably vulnerable to injection attacks, so only use it if the string is from a trusted source.


Finally, for anyone wanting a short, although somewhat dense, solution that uses only the standard library and isn't vulnerable to injection attacks, this little gem does the trick (assuming the dictionary keys are valid identifiers):

import ast
class RewriteName(ast.NodeTransformer):
    def visit_Name(self,node):
        return ast.Str(s=node.id)

transformer = RewriteName()
for x in datalist:
    tree = ast.parse(x,mode='eval')
    transformer.visit(tree)
    print ast.literal_eval(tree)['smallimage']
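
The snippet above targets Python 2. On recent Python 3 releases `ast.Str` is no longer available and `ast.Constant` takes its place. A sketch of the same idea under that assumption follows; the names `QuoteBareNames` and `parse_with_ast` are illustrative, and the sample URL is a placeholder:

```python
import ast

class QuoteBareNames(ast.NodeTransformer):
    """Replace every bare name (for example gallery) with a string constant."""

    def visit_Name(self, node):
        # Keep the original position info on the replacement node.
        return ast.copy_location(ast.Constant(value=node.id), node)

def parse_with_ast(text):
    """Parse a dict-like string whose keys are unquoted identifiers."""
    tree = ast.parse(text, mode='eval')
    tree = QuoteBareNames().visit(tree)
    # literal_eval accepts an AST node and only allows literal structures,
    # so function calls or attribute access in the input raise ValueError.
    return ast.literal_eval(tree)

# Placeholder data in the same shape as the question's strings.
sample = "{gallery: 'gal1', smallimage: 'http://example.com/s.jpg'}"
record = parse_with_ast(sample)
print(record['smallimage'])
```

Note that any bare name appearing as a value would also be turned into a string, which may or may not be what you want.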

Your datalist is a list of unicode strings.

You could use eval, except that your keys are not properly quoted. What you can do is requote your keys on the fly with replace:

for i in datalist:
    my_dict = eval(i.replace("gallery", "'gallery'").replace("smallimage", "'smallimage'").replace("largeimage", "'largeimage'"))
    print my_dict["smallimage"]
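
If the strings might come from an untrusted source, the same replace trick can be paired with `ast.literal_eval` instead of `eval`. This is a sketch rather than the answer's own method; `KEYS` and `parse_with_replace` are illustrative names, the sample URLs are placeholders, and it assumes each key is immediately followed by a colon, as in the question's data:

```python
import ast

# Key names seen in the question's data.
KEYS = ("gallery", "smallimage", "largeimage")

def parse_with_replace(text, keys=KEYS):
    """Quote the known keys, then evaluate the string as a literal only."""
    for key in keys:
        # Including the colon reduces accidental matches inside values.
        text = text.replace(key + ":", "'" + key + "':")
    return ast.literal_eval(text)

# Placeholder data in the same shape as the question's strings.
sample = "{gallery: 'gal1', smallimage: 'http://example.com/s.jpg',largeimage: 'http://example.com/l.jpg'}"
my_dict = parse_with_replace(sample)
print(my_dict["smallimage"])
```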

I don't see the need for all the extra things, such as using re or json.

fdict = {str(k): v for (k, v) in udict.items()}

Here udict is the dict that has unicode keys. Simply convert them to str. With your given data, you can simply do:

datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]

Simple test:

>>> datalist = [{u'a':1,u'b':2},{u'a':1,u'b':2}]
>>> datalist
[{u'a': 1, u'b': 2}, {u'a': 1, u'b': 2}]
>>> datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
>>> datalist
[{'a': 1, 'b': 2}, {'a': 1, 'b': 2}]

No import re or import json. Simple and quick.
