Convert elements in a list to a dictionary
I would like to convert the data into a dictionary to work with. The data looks like keys and values in a dictionary, but each pair is combined into a single string element.
Here's a sample of the data:
['"acetic anydride": "[CX3](=[OX1])[OX2][CX3](=[OX1])",\n',
'"acetylenic carbon": "[$([CX2]#C)]",\n',
'"acyl bromide": "[CX3](=[OX1])[Br]",\n',
'"acyl chloride": "[CX3](=[OX1])[Cl]",\n',
'"acyl fluoride": "[CX3](=[OX1])[F]",\n',
'"acyl iodide": "[CX3](=[OX1])[I]",\n',
'"aldehyde": "[CX3H1](=O)[#6]",\n',
'"alkane": "[CX4]",\n',
'"allenic carbon": "[$([CX2](=C)=C)]",\n',
'"amide": "[NX3][CX3](=[OX1])[#6]",\n',
'"amidium": "[NX3][CX3]=[NX3+]",\n',
'"amino acid": "[$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([*])[CX3](=[OX1])[OX2H,OX1-,N]",\n',
'"azide": "[$(-[NX2-]-[NX2+]#[NX1]),$(-[NX2]=[NX2+]=[NX1-])]",\n',
'"azo nitrogen": "[NX2]=N",\n',
'"azole": "[$([nr5]:[nr5,or5,sr5]),$([nr5]:[cr5]:[nr5,or5,sr5])]",\n',
'"azoxy nitrogen": "[$([NX2]=[NX3+]([O-])[#6]),$([NX2]=[NX3+0](=[O])[#6])]",\n',
'"diazene": "[NX2]=[NX2]",\n',
'"diazo nitrogen": "[$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]",\n',
'"bromine": "[Br]",\n']
I have tried removing the ":" in the data using the replace method, but it didn't work.
i=0
for line in lines:
    a = lines[i]
    a.replace(":", "")
    lines[i] = a
    i+=1

d = {}
for line in lines:
    s = line.split(":")
    d[s[0].strip(' "')] = s[1].strip(' ",\n')
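A note on why the replace loop above appears to do nothing: Python strings are immutable, so str.replace returns a new string and leaves the original untouched. The sketch below illustrates this with two sample lines in the same shape as the question's data (the short `lines` list and the name `cleaned` are only for illustration).

```python
# Two sample lines in the same shape as the question's data (illustrative only).
lines = ['"alkane": "[CX4]",\n', '"bromine": "[Br]",\n']

# str.replace returns a new string; the original string is left unchanged.
a = lines[0]
a.replace(":", "")
print(a == lines[0])  # the result of replace() was discarded, so nothing changed

# To keep the result, assign it back. enumerate replaces the manual counter.
cleaned = list(lines)
for i, line in enumerate(cleaned):
    cleaned[i] = line.replace(":", "")
print(cleaned[0])
```

Removing every ":" may not be what is wanted here anyway, since the colon is the only thing separating each key from its value, and some of the patterns contain colons themselves.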
You can use eval:
ll = ['"acetic anydride": "[CX3](=[OX1])[OX2][CX3](=[OX1])",\n',
'"acetylenic carbon": "[$([CX2]#C)]",\n',
'"acyl bromide": "[CX3](=[OX1])[Br]",\n',
'"acyl chloride": "[CX3](=[OX1])[Cl]",\n',
'"acyl fluoride": "[CX3](=[OX1])[F]",\n',
'"acyl iodide": "[CX3](=[OX1])[I]",\n',
'"aldehyde": "[CX3H1](=O)[#6]",\n',
'"alkane": "[CX4]",\n',
'"allenic carbon": "[$([CX2](=C)=C)]",\n',
'"amide": "[NX3][CX3](=[OX1])[#6]",\n',
'"amidium": "[NX3][CX3]=[NX3+]",\n',
'"amino acid": "[$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([*])[CX3](=[OX1])[OX2H,OX1-,N]",\n',
'"azide": "[$(-[NX2-]-[NX2+]#[NX1]),$(-[NX2]=[NX2+]=[NX1-])]",\n',
'"azo nitrogen": "[NX2]=N",\n',
'"azole": "[$([nr5]:[nr5,or5,sr5]),$([nr5]:[cr5]:[nr5,or5,sr5])]",\n',
'"azoxy nitrogen": "[$([NX2]=[NX3+]([O-])[#6]),$([NX2]=[NX3+0](=[O])[#6])]",\n',
'"diazene": "[NX2]=[NX2]",\n',
'"diazo nitrogen": "[$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]",\n',
'"bromine": "[Br]",\n']
dd = eval('{' + ' '.join(ll).replace('\n', '') + '}')
This joins your list into a single string, removes the \n characters, and adds the curly braces. You then have a str that can be evaluated, because it is valid Python code for a dictionary literal.
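As a more cautious variant of the eval approach, ast.literal_eval from the standard library accepts only Python literals, so it will not execute arbitrary code if the data comes from an untrusted file. This is a minimal sketch on a shortened copy of the list; the name `source` is only for illustration.

```python
import ast

# Shortened copy of the list from the answer above (illustrative only).
ll = [
    '"alkane": "[CX4]",\n',
    '"azo nitrogen": "[NX2]=N",\n',
    '"bromine": "[Br]",\n',
]

# Newlines and a trailing comma are allowed inside a dict literal,
# so the lines can be joined as they are and wrapped in braces.
source = "{" + "".join(ll) + "}"

# literal_eval parses literals only (str, dict, list, numbers, ...).
dd = ast.literal_eval(source)
print(dd["azo nitrogen"])
```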
This is just a formatting problem, or more precisely a data-cleaning problem. I am not sure why you are using an increment variable. The first thing I would handle is the ',\n' at the end of each element; then split on ': ' and build a dictionary from the two parts. You can try the code below.
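d = {}
for element in lines:
    # Drop the trailing comma and newline.
    element = element.rstrip(",\n")
    # Split on the first ': ' only, in case a value contains the separator.
    key, value = element.split(": ", 1)
    d[key.strip('"')] = value.strip('"')
d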
I used strip('"') to remove the surrounding quotation marks.
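The same cleaning steps can also be sketched with str.partition, which always returns three parts and splits only at the first occurrence of the separator. The helper name `parse_line` and the two-element `lines` list are assumptions for illustration, not part of the original answer.

```python
def parse_line(line):
    """Turn one '"key": "value",\\n' line into a (key, value) tuple."""
    # partition splits at the first ': ' only, so colons inside the
    # pattern (as in the azole entry) stay in the value.
    key, _, value = line.partition(": ")
    return key.strip('"'), value.rstrip(",\n").strip('"')


# Two sample lines taken from the question's data.
lines = [
    '"alkane": "[CX4]",\n',
    '"azole": "[$([nr5]:[nr5,or5,sr5]),$([nr5]:[cr5]:[nr5,or5,sr5])]",\n',
]

# dict() accepts an iterable of (key, value) pairs.
d = dict(parse_line(line) for line in lines)
print(d["azole"])
```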
Each element in the list is a string ending in ',\n'. These trailing characters should be removed. The keys and values are also wrapped in unnecessary double quotes, which should be removed as well. I think this should give you what you need:
mylist = ['"acetic anydride": "[CX3](=[OX1])[OX2][CX3](=[OX1])",\n',
'"acetylenic carbon": "[$([CX2]#C)]",\n',
'"acyl bromide": "[CX3](=[OX1])[Br]",\n',
'"acyl chloride": "[CX3](=[OX1])[Cl]",\n',
'"acyl fluoride": "[CX3](=[OX1])[F]",\n',
'"acyl iodide": "[CX3](=[OX1])[I]",\n',
'"aldehyde": "[CX3H1](=O)[#6]",\n',
'"alkane": "[CX4]",\n',
'"allenic carbon": "[$([CX2](=C)=C)]",\n',
'"amide": "[NX3][CX3](=[OX1])[#6]",\n',
'"amidium": "[NX3][CX3]=[NX3+]",\n',
'"amino acid": "[$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([*])[CX3](=[OX1])[OX2H,OX1-,N]",\n',
'"azide": "[$(-[NX2-]-[NX2+]#[NX1]),$(-[NX2]=[NX2+]=[NX1-])]",\n',
'"azo nitrogen": "[NX2]=N",\n',
'"azole": "[$([nr5]:[nr5,or5,sr5]),$([nr5]:[cr5]:[nr5,or5,sr5])]",\n',
'"azoxy nitrogen": "[$([NX2]=[NX3+]([O-])[#6]),$([NX2]=[NX3+0](=[O])[#6])]",\n',
'"diazene": "[NX2]=[NX2]",\n',
'"diazo nitrogen": "[$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]",\n',
'"bromine": "[Br]",\n']
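mydict = dict()
for e in mylist:
    # Split on the first ':' only; some patterns (e.g. azole) contain ':' themselves.
    t = e.replace('"', '').split(':', 1)
    # Drop the trailing ',\n' and any surrounding whitespace.
    mydict[t[0]] = t[1][:-2].strip()
print(mydict)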
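Since each element already looks like a fragment of a JSON object, another option is to join the lines and let the json module do the parsing. This is a sketch under the assumption that the values contain no backslashes or other characters that would need JSON escaping; the name `body` and the shortened `mylist` are only for illustration.

```python
import json

# Shortened copy of the list (illustrative only).
mylist = [
    '"aldehyde": "[CX3H1](=O)[#6]",\n',
    '"diazene": "[NX2]=[NX2]",\n',
]

# JSON does not allow a trailing comma, so remove the final newline
# and the comma before wrapping the text in braces.
body = "".join(mylist).rstrip().rstrip(",")
mydict = json.loads("{" + body + "}")
print(mydict)
```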