
How to create a dictionary from file?

I want to create a dictionary with values from a file.

The problem is that the file has to be read line by line as entries are added, because I don't think I have enough memory to load all of the information at once.

The key can be a default, but the value will be one selected from each line in the file. The file is not CSV, but I always split the lines so that I can select a value from each one.

import sys

def prod_check(dirname):
    dict1 = {}
    k = 0
    with open('select_sha_sub_hashes.out') as inf:
        for line in inf:
            pline = line.split('|')
            value = pline[3]
            dict1[line] = dict1[k]
            k += 1
            print dict1

if __name__ == "__main__":
    dirname = sys.argv[1]
    prod_check(dirname)

This is the code I am working with. The variable I have set as value is the field, selected by index, from the line of the file I am pulling data from. I run into a problem when I try to print the dictionary, and I think it may be a problem in my syntax or in an assignment I made. I want the values to be added under the keys, with the keys remaining plain numbers like 0-100.
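One likely culprit is the assignment `dict1[line] = dict1[k]`: it reads `dict1[k]` before anything has been stored under `k`, which raises a `KeyError` on the first line, and it uses the whole line as the key rather than the counter. A minimal sketch of what the question seems to be after, written for Python 3 (the `print dict1` statement above is Python 2); the function name `build_dict` and its `sep` and `field` parameters are hypothetical:

```python
def build_dict(path, sep='|', field=3):
    """Map line numbers (0, 1, 2, ...) to one selected field of each line."""
    result = {}
    with open(path) as inf:
        # enumerate supplies the running key, so no manual counter is needed
        for k, line in enumerate(inf):
            parts = line.rstrip('\n').split(sep)
            result[k] = parts[field]
    return result
```

This still keeps every selected value in memory; only the file itself is read one line at a time.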

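If you don't have enough memory to store the entire dictionary in RAM at once, try anydbm, bsddb and/or gdbm. These are dictionary-like objects that keep key-value pairs on disk in a single-table database that maps string keys to string values. (In Python 3, anydbm was renamed dbm, gdbm became dbm.gnu, and bsddb was removed from the standard library.)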
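A rough sketch of the on-disk approach, assuming Python 3, where the `anydbm` module is named `dbm`. The function name `build_on_disk` and its parameters are hypothetical, and the `|` separator and field index 3 are taken from the question's code:

```python
import dbm

def build_on_disk(src_path, db_path, sep='|', field=3):
    """Store line number -> selected field in an on-disk dbm database."""
    # 'c' opens the database for reading and writing, creating it if needed
    with dbm.open(db_path, 'c') as db:
        with open(src_path) as inf:
            for k, line in enumerate(inf):
                value = line.rstrip('\n').split(sep)[field]
                # dbm holds only strings/bytes, so the integer key is converted
                db[str(k)] = value
```

Values come back as bytes when read, so `db['0']` would return something like `b'...'` and may need `.decode()` before use.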

Optionally, consider: http://stromberg.dnsalias.org/~strombrg/cachedb.html ...which will let you convert transparently between serialized and non-serialized representations.

Have a look at something like "Tokyo Cabinet" @ http://fallabs.com/tokyocabinet/ which has Python bindings and is fairly efficient. There's also Kyoto Cabinet, but the licensing on that is a little restrictive.

Also check out this previous Stack Overflow post: "Reliable and efficient key--value database for Linux?"

So it sounds as if the main problem is reading the file line-by-line. To read a file line-by-line you can do this:

with open('data.txt') as inf:
    for line in inf:
        # do the rest of your processing here; only the current line is in memory
        pass

The advantage of using with is that the file is closed for you automagically when you are done or an exception occurs.
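A small sketch that makes the closing behaviour visible; `count_lines` is a hypothetical helper, not something from the original post:

```python
def count_lines(path):
    """Count lines without loading the whole file; return (count, file_closed)."""
    count = 0
    with open(path) as inf:
        for line in inf:  # the file object yields one line per iteration
            count += 1
    # Leaving the with block has already closed the file
    return count, inf.closed
```

The second element of the returned tuple is `True`, since the file is closed as soon as the `with` block exits.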

--

Note: the original post didn't contain any code; it now seems to have incorporated a copy of this code to help further explain the problem.

