
Python creating a dictionary from a csv file

I am trying to create a data dictionary from a csv file and am having some trouble. I have successfully made a dictionary from two lists in my program. Here is my code:

playerRank = [[tournamentResults[i],rankingPoints[8]] for i in range(0,len(tournamentResults))]
dict1 = dict(playerRank)

However, when I attempt to make a dictionary out of data I have in a csv file, I get the error "TypeError: unhashable type: 'list'". Below is the code I tried:

import csv

totalRank = []
with open("mycsvfile.csv") as players:
    for row in csv.reader(players):
        totalRank.append(row)
    totalRank = [[totalRank[i],0] for i in range(0,len(totalRank))]
dict2 = dict(totalRank)

I don't understand why the second attempt at making a dictionary throws the error, whereas the first dictionary is fine. Any help on how I could resolve this would be greatly appreciated!

The problem is that, as the error message says, lists are not hashable, which means you can't use them as dict keys.

In fact, the reason lists aren't hashable is to prevent you from using them as dict keys. Lists are mutable, and if you mutate a key in a dict, lookup won't work anymore. (Technically, you could get around this by using a hash function based on object identity instead of the contained values, but then either == wouldn't be useful, or it wouldn't line up with hash.)
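To make the "lookup won't work anymore" part concrete, here is a minimal sketch. HashableList is a made-up class, not something from the standard library; it exists only to show what would go wrong if lists hashed by value:

```python
class HashableList(list):
    """Hypothetical list subclass that hashes by its contents (illustration only)."""

    def __hash__(self):
        return hash(tuple(self))


key = HashableList([1, 2])
d = {key: "found"}

key.append(3)  # mutate the key after it has been stored in the dict

# The entry is still physically in the dict...
stored_keys = list(d)

# ...but it was filed under the hash of [1, 2], so neither the old
# value nor the new value can be used to look it up anymore.
lookup_old_value = d.get(HashableList([1, 2]))
lookup_new_value = d.get(HashableList([1, 2, 3]))
```

Both lookups come back as None even though the dict still holds one entry, which is exactly the situation the built-in list avoids by refusing to be hashed at all.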

The usual solution is pretty simple: a tuple is just like a list, except immutable. So if your keys are lists, instead of this:

[[key, value] for ...]

… you do this:

[[tuple(key), value] for ...]

And now, you can pass it to dict and everything works.

Of course this assumes that you don't want to mutate those sequences after creating them.
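Here is a small sketch of that conversion. The rows are invented sample data standing in for what csv.reader would give you:

```python
# Hypothetical rows, shaped like the lists csv.reader yields.
rows = [["Alice", "UK"], ["Bob", "US"]]

# dict([[row, 0] for row in rows]) would raise
# TypeError: unhashable type: 'list'.
# Converting each row to a tuple makes it usable as a key.
pairs = [[tuple(row), 0] for row in rows]
ranks = dict(pairs)
```

Note that the keys are now tuples, so you have to look them up as tuples too, e.g. ranks[("Alice", "UK")].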


Meanwhile, I'm not sure why you want to use a sequence of values in the first place here, and you may in fact want something simpler. Your totalRank is a list of rows, and you probably only want one column in each row, not the whole thing. In that case, just do this:

[[totalRank[i][0], 0] for ...]

Or, alternatively, instead of totalRank.append(row), do totalRank.append(row[0]).

(I'm assuming it's the first column you want here; obviously you can do row[3] or whatever if you want a different one.)
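As a rough sketch of the single-column version, with io.StringIO standing in for your file and made-up contents (I'm guessing the player name is in the first column):

```python
import csv
import io

# Stand-in for open("mycsvfile.csv"); the contents are invented.
players = io.StringIO("Alice,UK\nBob,US\n")

totalRank = []
for row in csv.reader(players):
    totalRank.append(row[0])  # keep only the first column (a string)

# Strings are hashable, so this works without any tuple conversion.
dict2 = dict([[name, 0] for name in totalRank])
```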


While we're at it, if you're using Python 3, or Python 2.7, you can write this more readably (and efficiently, too) using a dictionary comprehension instead of a list comprehension. Instead of this:

lst = [[key, value] for ...]
dct = dict(lst)

… just do this:

dct = {key: value for ...}

Also, you don't have to loop over i in range(len(lst)) if the only thing you're using i for is lst[i]; just loop over element in lst.
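For comparison, here is a sketch of both styles side by side. The two lists are invented stand-ins for your tournamentResults and rankingPoints, and using zip to walk two lists in step is my suggestion for that case, not something your code has to do:

```python
# Hypothetical data in place of tournamentResults and rankingPoints.
players = ["Alice", "Bob", "Carol"]
points = [100, 50, 25]

# Index-based list of pairs, then dict(): works, but noisy.
by_index = dict([[players[i], points[i]] for i in range(0, len(players))])

# Dict comprehension that pairs the two lists directly with zip().
by_zip = {player: pts for player, pts in zip(players, points)}

# With only one list involved, loop over its elements directly.
zero_ranks = {player: 0 for player in players}
```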

Putting it all together:

dict2 = {tuple(rank): 0 for rank in totalRank}

… or, depending on what you wanted:

dict2 = {rank[0]: 0 for rank in totalRank}

And one more improvement. This:

totalRank = []
for row in csv.reader(players):
    totalRank.append(row)

Is just a verbose way of writing this:

totalRank = list(csv.reader(players))

Or, if you wanted just the first column:

totalRank = [row[0] for row in csv.reader(players)]

So we can reduce your entire loop to this:

with open("mycsvfile.csv") as players:
    dict2 = {tuple(row): 0 for row in csv.reader(players)}

… or, again:

with open("mycsvfile.csv") as players:
    dict2 = {row[0]: 0 for row in csv.reader(players)}
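If you want something you can run end to end, here is a self-contained sketch. It writes a small made-up CSV to a temporary file first, since I don't have your mycsvfile.csv. Two things in it are my additions: opening the file with newline="", which the csv module docs recommend, and the "if row" filter, because csv.reader yields an empty list for a blank line and row[0] would then raise IndexError:

```python
import csv
import os
import tempfile

# Create a throwaway CSV with invented contents, including one blank line.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", newline="", delete=False
) as tmp:
    tmp.write("Alice,UK\nBob,US\n\nCarol,FR\n")
    path = tmp.name

try:
    with open(path, newline="") as players:
        # Skip empty rows so row[0] is always safe to access.
        dict2 = {row[0]: 0 for row in csv.reader(players) if row}
finally:
    os.remove(path)  # clean up the temporary file
```

Keep in mind that if the same first-column value appears in more than one row, the dict ends up with just one entry for it.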


 