
Creating a dictionary from a list of strings

I have a list of strings:

list = ['2(a)', '2(b)', '3', '3(a)', '1d', '5']

where it is intentional that 1d, 3, and 5 don't involve parentheses.

I would like to create a dictionary which looks like this:

dict = {'2': 'a', '2': 'b', '3': 'a', '1': 'd'}

or

dict = {'2': ['a', 'b'], '3': ['a'], '1': ['d']}.

Essentially, ignore those strings without a letter a-z. I've used regular expressions to extract the following from the list above:

['a', 'b', 'a', 'd'],

but this hasn't helped me much in forming my dictionary easily.

Any help is much appreciated.

Since a dictionary can't contain duplicate keys, use a defaultdict:

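import collections
l = ['2(a)', '2(b)', '3', '3(a)', '1c', '5']
d = collections.defaultdict(list)  # missing keys start as an empty list
for item in l:
    # keep only the digits for the key and only the letters for the value
    num = ''.join(c for c in item if c.isdigit())
    word = ''.join(c for c in item if c.isalpha())
    # skip items such as '3' or '5' that have no letter
    if word and num:
        d[num].append(word)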

Result:

>>> print(d)
defaultdict(<class 'list'>, {'2': ['a', 'b'], '3': ['a'], '1': ['c']})
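As a quick check of why the first form in the question cannot work (a small sketch added for illustration, not part of the original answer): a dict literal with a repeated key keeps only the last value given for that key, so '2': 'a' is silently lost.

```python
# A dict literal with a repeated key keeps only the last value for that key.
d = {'2': 'a', '2': 'b', '3': 'a', '1': 'd'}

print(d)       # {'2': 'b', '3': 'a', '1': 'd'} on Python 3.7+
print(len(d))  # 3, not 4
```

This is why the values have to be collected in a list per key.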

This is a good time to use setdefault() to define the structure of your dictionary. The first part captures the numbers from each element using a regex that matches every run of digits, re.findall(r'\d+', i). The resulting list is then concatenated using join().

We then extract only the alphabetic characters, either with a list comprehension -> [j for j in i if j.isalpha()], or with a generator expression -> j for j in i if j.isalpha() (a generator in our case), and join them back together into a single string.

Lastly, a check that both key and value exist, so that the dictionary ends up in this format -> { '' : [] , ...}

import re

def to_dict(l):
    d = {}
    for i in l: 
        key = re.findall(r'\d+', i)
        value = ''.join(j for j in i if j.isalpha())
        if key and value:
            d.setdefault(''.join(key), []).append(value)    
    return d

Sample output:

l = ['2(a)', '2(b)', '3', '3(a)', '1c', '5']
print(to_dict(l))
>>> {'2': ['a', 'b'], '3': ['a'], '1': ['c']}
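Since the question mentions regular expressions, the number and the letter can also be captured in a single pattern. The following is only a sketch under assumptions: the name group_letters and the pattern are hypothetical, the pattern assumes each item is a run of digits followed by lowercase letters with optional parentheses, and it does not check that the parentheses are balanced.

```python
import re
from collections import defaultdict

# Hypothetical pattern: digits, then one or more lowercase letters,
# each parenthesis being optional, e.g. '2(a)' or '1c'.
PATTERN = re.compile(r'(\d+)\(?([a-z]+)\)?')

def group_letters(items):
    """Group the letter part of each item under its numeric prefix."""
    grouped = defaultdict(list)
    for item in items:
        match = PATTERN.fullmatch(item)
        if match:  # items such as '3' or '5' have no letter and are skipped
            number, letter = match.groups()
            grouped[number].append(letter)
    return dict(grouped)  # plain dict, so printing shows no defaultdict wrapper

print(group_letters(['2(a)', '2(b)', '3', '3(a)', '1c', '5']))
```

Unlike the character-filtering versions above, this skips any item that does not fit the expected shape instead of picking digits and letters out of it.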
