
What is the best Pythonic way to parse this data?

I'm new to Python and am trying to find the most Pythonic way to parse a response from an LDAP query. What I have so far works, but I'd like to make it neater if possible. My response data is this:

"[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

Out of that data I'm really only interested in the fields within the {}, so that I can put them into a dictionary...

"department:theDepartment,mail:theEmail@mycompany.com"

What I'm doing now feels (and looks) really brute-force, but it works. I've added comments showing the result of each step to try to explain this mess.

#Original string
#"[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

#split at the opening {, take the latter half
myDetails = str(result_set[0]).split('{')
#myDetails[1] = "'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

#split at the closing }, take the former half
myDetails = str(myDetails[1]).split('}')
#myDetails[0] = "'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']"

#split at the comma to separate the two response fields
myDetails = str(myDetails[0]).split(',')
#myDetails = ["'department': ['theDepartment']", " 'mail': ['theEmail@mycompany.com']"]

#clean up the first response field (strip quotes, spaces and brackets)
myDetails[0] = str(myDetails[0]).translate(None, "'").translate(None," [").translate(None,"]")
#myDetails[0] = "department:theDepartment"

#clean up the second response field (strip quotes, spaces and brackets)
myDetails[1] = str(myDetails[1]).translate(None," '").translate(None, "'").translate(None,"[").translate(None,"]")
#myDetails[1] = "mail:theEmail@mycompany.com"

While I'm a big fan of "if it ain't broke, don't fix it", I'm a bigger fan of efficiency.

EDIT: This ended up working for me, per the accepted answer below by @Mario:

myUser = ast.literal_eval(str(result_set[0]))[0][1] 
myUserDict = { k: v[0] for k, v in myUser.iteritems() }
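Since the code above calls str(result_set[0]) before parsing, result_set may already be a nested Python list rather than a string. If so (an assumption; the question does not show how result_set is produced), the round trip through str and ast.literal_eval could probably be skipped and the dictionary indexed directly. A minimal sketch, with a hand-built result_set standing in for the real LDAP response and written with items() so it also runs on Python 3:

```python
# Hypothetical stand-in for the LDAP response; the real result_set
# would come from the query itself.
result_set = [[
    ('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com',
     {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}),
]]

# result_set[0] is a list of (dn, attributes) tuples; unpack the first one.
dn, attributes = result_set[0][0]

# Keep only the first value of each attribute.
my_user_dict = {key: values[0] for key, values in attributes.items()}
print(my_user_dict)
```

If result_set really is a string, this shortcut does not apply and the ast.literal_eval approach is the one to use.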

Trusting your input and counting on its strict regularity, this will parse your example data and produce what you're expecting:

import ast

ldapData = "[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

# Using the ast module's function is much safer than using eval. (See below!)
obj = ast.literal_eval(ldapData)[0][0]
rawDict = obj[1]
data = { k: v[0] for k, v in rawDict.iteritems() }

# The dictionary.
print data

The line using the curly brackets is called a dict comprehension. It builds a new dictionary that maps each key to the first element of its list.
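To illustrate what the comprehension does, here is a small sketch comparing it with an explicit loop. The names raw_dict and data_loop are only for illustration, and items() is used in place of iteritems() so the snippet runs on both Python 2 and Python 3:

```python
raw_dict = {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}

# Dict comprehension: one entry per attribute, keeping the first list element.
# items() works on Python 2 and 3; iteritems() exists only on Python 2.
data = {k: v[0] for k, v in raw_dict.items()}

# The same result built with an explicit loop.
data_loop = {}
for k, v in raw_dict.items():
    data_loop[k] = v[0]

print(data == data_loop)
```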


Edit: Another user on this thread suggests using the ast.literal_eval function. After researching this, I have to agree. The eval function will execute any string. If the input were something like this, you'd have a big problem:

eval("__import__('os').system('rm -R *')") 

On the other hand, if this same string were parsed with ast.literal_eval, you would get an exception:

>>> import ast
>>> ast.literal_eval("__import__('os').system('rm -R *')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib64/python2.7/ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
>>> 
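Because ast.literal_eval raises instead of executing, a caller can catch the exception and treat the input as unparseable. A minimal sketch of that idea; the helper name parse_literal is hypothetical, and catching only ValueError and SyntaxError is an assumption that covers the common failure cases rather than every possible one:

```python
import ast

def parse_literal(text):
    """Return the Python literal in text, or None if it is not a plain literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # ValueError: valid syntax but not a literal (e.g. a function call).
        # SyntaxError: not valid Python at all.
        return None

print(parse_literal("{'mail': ['theEmail@mycompany.com']}"))
print(parse_literal("__import__('os').system('echo unsafe')"))
```

The second call is never executed as code; it is rejected and the helper returns None.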

Further discussion can be found here:

http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

The module's documentation is here:

https://docs.python.org/2/library/ast.html

Considering this uses ast.literal_eval, it's not perfect, but it sure is cleaner:

>>> import ast
>>> a = "[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"
>>> ast.literal_eval(a)[0][0][1]
{'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']}
>>> type(ast.literal_eval(a)[0][0][1])
<type 'dict'>
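The dictionary returned here still holds each value inside a list. As a rough sketch, it could be flattened and joined into the "department:theDepartment,mail:theEmail@mycompany.com" form shown in the question. This assumes every attribute has exactly one value, and the names flat and summary are only illustrative:

```python
import ast

# Raw string, so the escaped comma in the DN reaches literal_eval as "\\,".
a = r"[[('CN=LName\\, FName,OU=MinorUserGroup,OU=MajorUserGroup,DC=my,DC=company,DC=com', {'department': ['theDepartment'], 'mail': ['theEmail@mycompany.com']})]]"

attributes = ast.literal_eval(a)[0][0][1]

# Flatten each single-element list to its first value.
flat = {key: values[0] for key, values in attributes.items()}

# Sort the keys so the output order does not depend on dict ordering.
summary = ",".join("{}:{}".format(key, flat[key]) for key in sorted(flat))
print(summary)
```

Attributes with several values would need a different rule, for example joining the values with a separator, since values[0] silently drops everything after the first.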
