
Python: reading line from file with different types of variables

I'm trying to use a dictionary to analyze a two-column (color, number_of_occurrences) .tsv file that has a heading line. I want to skip the heading line in the most generic way possible (I assume this means requiring the 2nd column to be of int type). The following is the best I've come up with, but it seems like there has to be a better way:

filelist = []
color_dict = {}
with open('file1.tsv') as F:
    filelist = [line.strip('\n').split('\t') for line in F]
for item in filelist:
    try: #attempt to add values to existing dictionary entry
        x = color_dict[item[0]]
        x += int(item[1])
        color_dict[item[0]] = x
    except: #color not observed yet (KeyError) or non-convertible string (ValueError): create new entry
        try:
            color_dict[item[0]] = int(item[1])
        except(ValueError): #if item[1] can't convert to int
            pass

It seems like there should be a better way to handle the tries and exceptions.

File excerpt by request:

color Observed
green 15
gold 20
green 35

Can't you just skip the first element of the list by slicing it with [1:], like this:

filelist = [line.strip('\n').split('\t') for line in F][1:]

Now filelist won't contain the element for the first line, i.e. the heading line.

Or, as pointed out in a comment by @StevenRumbalski, you can simply call next(F, None) before your list comprehension. This skips the first line without making a sliced copy of the list, like this:

with open('file1.tsv') as F:
    next(F, None)
    filelist = [line.strip('\n').split('\t') for line in F]
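As a rough sketch of why the None default matters: without it, next() raises StopIteration on an empty file. The helper name read_rows is made up for illustration, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import io

def read_rows(handle):
    """Skip the first line of handle, then split the remaining lines on tabs."""
    # The None default keeps next() from raising StopIteration on an empty file.
    next(handle, None)
    return [line.rstrip('\n').split('\t') for line in handle]

# io.StringIO is a stand-in for open('file1.tsv'); the data mirrors the question's excerpt.
sample = io.StringIO("color\tObserved\ngreen\t15\ngold\t20\ngreen\t35\n")
rows = read_rows(sample)
empty_rows = read_rows(io.StringIO(""))  # no header, no data, no exception

print(rows)
print(empty_rows)
```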

Also, it would be better if you use a defaultdict here.

Use it like this:

from collections import defaultdict
color_dict = defaultdict(int)

This way, you won't have to check whether the key exists before operating on it. So, you can simply do:

color_dict[item[0]] += int(item[1])
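Put together, the counting loop might look something like this. It is only a sketch: filelist is hard-coded here with the rows from the question's excerpt, on the assumption that the heading line has already been skipped.

```python
from collections import defaultdict

# Rows as they would look after the heading line has been skipped
# (values taken from the file excerpt in the question).
filelist = [['green', '15'], ['gold', '20'], ['green', '35']]

color_dict = defaultdict(int)  # a missing key starts at int(), which is 0
for color, count in filelist:
    color_dict[color] += int(count)

print(dict(color_dict))
```

Unpacking each row into color and count is just a readability choice; indexing with item[0] and item[1] behaves the same.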

I would use defaultdict in this case, because the first time each key is encountered it is not yet in the mapping, so an entry is created automatically (with the value int(), i.e. 0).

from collections import defaultdict
color_dict = defaultdict(int)
for item in filelist:
    color_dict[item[0]] += int(item[1])
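If you still want the test from the question (skip any line whose 2nd column is not an int) rather than assuming the first line is always a header, one possible sketch is below. The function name tally_colors is hypothetical, and the sample lines are taken from the question's excerpt.

```python
from collections import defaultdict

def tally_colors(lines):
    """Sum the 2nd column per color, skipping rows whose 2nd column is not an int."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        try:
            count = int(fields[1])
        except (IndexError, ValueError):
            # Header rows, blank lines and malformed rows end up here and are skipped.
            continue
        totals[fields[0]] += count
    return dict(totals)

# A list of strings stands in for the open file; any iterable of lines works.
sample = ["color\tObserved\n", "green\t15\n", "gold\t20\n", "green\t35\n"]
totals = tally_colors(sample)
print(totals)
```

Only the int() conversion sits inside the try, and only the two expected exceptions are caught, so unrelated errors are not hidden the way a bare except would hide them.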

