
Creating dictionary from list of lists

I am working on an online course exercise (practice problem before the final test).

The test involves working with a big csv file (not downloadable) and answering questions about the dataset. You're expected to write code to get the answers. The data set is a list of all documented baby names each year, along with how often each name was used for boys and for girls. A sample list of the first 10 lines is also given:

Isabella,42567,Girl
Sophia,42261,Girl
Jacob,42164,Boy

and so on.

Questions you're asked include things like 'how many names are in the data set', 'how many boys' names begin with z', etc.

I can get all the data into a list of lists:

[['Isabella', '42567', 'Girl'], ['Sophia', '42261', 'Girl'], ['Jacob', '42164', 'Boy']]

My plan was to convert it into a dictionary, as that would probably be easier for answering some of the other questions. The list of lists is saved to the variable 'data':

names = {}
for d in data:
    names[d[0]] = d[1:]

print(names)
{'Isabella': ['42567', 'Girl'], 'Sophia': ['42261', 'Girl'], 'Jacob': ['42164', 'Boy']}

Works perfectly.

Here's where it gets weird. If, instead of opening the sample file with 10 lines, I open the real csv file with around 16,000 lines, everything works perfectly right up to the very last bit. I get the complete list of lists, but when I go to create the dictionary, it breaks (here I'm just showing the first three items, but the full 16,000 lines are all wrong in a similar way):

names = {}
for d in data:
    names[d[0]] = d[1:]

print(names)
{'Isabella': ['56', 'Boy'], 'Sophia': ['48', 'Boy'], 'Jacob': ['49', 'Girl']

I know the data is there and correct, since I can read it directly:

for d in data:
    print(d[0], d[1], d[2])

Isabella 42567 Girl
Sophia 42261 Girl
Jacob 42164 Boy

Why would this dictionary work fine with the csv file with 10 lines, but completely break with the full file? I can't find any

Follow the comments to create two dicts, or a single dictionary with tuple keys. Using tuples as keys is fine if you keep your variables inside Python, but you might get into trouble when exporting to JSON, for example.
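To illustrate the JSON caveat, here is a minimal sketch using the three sample rows from the question. The `"name|sex"` string key is only a hypothetical workaround, not something the question or the comments prescribe.

```python
import json

# Sample rows from the question (the real file has around 16,000 rows).
data = [['Isabella', '42567', 'Girl'], ['Sophia', '42261', 'Girl'], ['Jacob', '42164', 'Boy']]

# Dictionary keyed by (name, sex) tuples.
names = {(name, sex): freq for name, freq, sex in data}

try:
    json.dumps(names)
    tuple_keys_ok = True
except TypeError:
    # json.dumps only accepts str, int, float, bool or None as keys.
    tuple_keys_ok = False

# Hypothetical workaround: join each tuple into a single string key.
exportable = {f"{name}|{sex}": freq for (name, sex), freq in names.items()}
encoded = json.dumps(exportable)
```

The separator is arbitrary; anything that cannot appear in a name would do.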

Try a dictionary comprehension with list unpacking:

names = {(name, sex): freq for name, freq, sex in data}

Or a for loop, as you started:

names = dict()
for name, freq, sex in data:
    # key on (name, sex) so a boy row and a girl row for the same name don't collide
    names[(name, sex)] = freq
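A small sketch of why the key choice matters. It uses the three rows from the question plus one assumed extra row for the same name with the other sex, which is the kind of row the full file appears to contain, judging by the broken output shown in the question.

```python
# Three rows from the question plus one assumed later row for the same name.
data = [
    ['Isabella', '42567', 'Girl'],
    ['Sophia', '42261', 'Girl'],
    ['Jacob', '42164', 'Boy'],
    ['Isabella', '56', 'Boy'],  # assumption: same name, other sex
]

# Keyed by name only: the later row silently replaces the earlier one.
by_name = {}
for name, freq, sex in data:
    by_name[name] = [freq, sex]

# Keyed by (name, sex): both Isabella rows survive.
by_pair = {(name, sex): freq for name, freq, sex in data}

print(by_name['Isabella'])
print(by_pair[('Isabella', 'Girl')])
print(len(by_pair))

# Example query: boys' names starting with a given letter.
boys_j = [name for (name, sex) in by_pair if sex == 'Boy' and name.startswith('J')]
```

With name-only keys the dictionary ends up with three entries and the girl count for Isabella is lost; with tuple keys all four rows are kept.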

I'd go with something like:

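results = {}
for d in data:
    # each row is already a [name, amount, gender] list, so unpack it directly
    name, amount, gender = d
    # reuse the inner dict for this name, creating it the first time the name appears
    results.setdefault(name, {})[gender] = amount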

This way you'll get results like:

{
    'Isabella': {'Girl': '42567', 'Boy': '67'}, 
    'Sophia': {'Girl': '42261'}, 
    'Jacob': {'Boy': '42164'}
}

However, a duplicated name and gender pair will overwrite the previous value, so you need to take that into account if there are any. It also assumes that the whole file matches the format you've provided.
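If duplicates do exist (for instance, the same name and gender listed for several years), one option is to add the counts up instead of overwriting. This is only a sketch: whether summing is the right way to combine duplicates is an assumption, and the third row below is made up for illustration.

```python
from collections import defaultdict

data = [
    ['Isabella', '42567', 'Girl'],
    ['Isabella', '56', 'Boy'],
    ['Isabella', '100', 'Girl'],  # assumption: a duplicate (name, gender) row
]

# Nested counters: totals[name][gender] starts at 0 and accumulates.
totals = defaultdict(lambda: defaultdict(int))
for name, amount, gender in data:
    totals[name][gender] += int(amount)  # counts arrive as strings

# Convert back to plain dicts for printing or export.
result = {name: dict(counts) for name, counts in totals.items()}
print(result)
```

Converting the amounts to `int` also makes later questions about totals straightforward.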
