Error when loading a file into a dictionary in Python

I have a CSV file with two columns and more than 6000 rows and would like to load it into a dictionary in Python. Here is part of the file:

ENST00000589805,CCCTCCCGGACTCCTCTCCCCGGCCGGCCGGCAAGAGTTTACAA
ENST00000376512,GTTGCCGAGGGGACGGGCCGGGCAGATGCCAAC
ENST00000314332,TTTAAG

I wrote this function:

def file_to_dict(filename):
    f = open(filename, 'r')
    answer = {}
    for line in f:
        k, v = line.strip().split(',')
        answer[k.strip()] = v.strip()
    return answer

I tried it on a small file and it worked perfectly, but when I tried it on my big file, it gave this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in file_to_dict
ValueError: too many values to unpack

I tried to find a solution but did not manage to. Do you know how to resolve it? By the way, the dictionary should look like this:

{'ENST00000589805': 'CCCTCCCGGACTCCTCTCCCCGGCCGGCCGGCAAGAGTTTACAA', 'ENST00000376512': 'GTTGCCGAGGGGACGGGCCGGGCAGATGCCAAC', 'ENST00000314332': 'TTTAAG'}

One possible (but not the only possible) cause is a blank line in your input file, for example an extra newline at the end. That breaks the unpacking after the split() call too, although strictly it yields too few values rather than too many. One way to guard against it is as follows:

for line in f:
    line = line.strip()
    if line:
        k, v = line.split(',')
        answer[k.strip()] = v.strip()

It is equally possible that your input file breaks your assumptions in some other way. To handle this, you should beef up the error checking in your code.
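As a rough illustration of what that extra error checking could look like, here is a minimal sketch. The function name file_to_dict_checked and the wording of the error message are made up for this example; the idea is simply to skip blank lines and report the line number of any line that does not split into exactly two fields.

```python
def file_to_dict_checked(filename):
    """Load 'key,value' lines into a dict and report malformed lines.

    Hypothetical variant of the question's file_to_dict, not the original code.
    """
    answer = {}
    with open(filename, 'r') as f:
        # Count from 1 so the reported number matches what a text editor shows.
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # skip blank lines, e.g. a trailing newline
            fields = line.split(',')
            if len(fields) != 2:
                raise ValueError('line %d: expected 2 fields, got %d: %r'
                                 % (lineno, len(fields), line))
            answer[fields[0].strip()] = fields[1].strip()
    return answer
```

With this version the traceback ends in a message that names the offending line, so you can open the file and inspect it directly.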

One or more of the lines probably has more than one comma in it. Because you're splitting on commas, such a line is broken up into more than two values, but you've only given two names to unpack into. Find the line with the extra comma and fix it, or add an extra variable name if needed.
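To find the line with the extra comma, something along these lines might help. It is only a sketch: find_bad_lines is a hypothetical helper, and the sample list stands in for your real file (an open file object can be passed in the same way, since it also yields lines).

```python
def find_bad_lines(lines):
    """Return (line_number, text) for non-blank lines without exactly one comma."""
    bad = []
    for lineno, line in enumerate(lines, 1):
        line = line.strip()
        if line and line.count(',') != 1:
            bad.append((lineno, line))
    return bad


# Made-up sample data; the second entry is deliberately malformed.
sample = [
    'ENST00000589805,CCCTCC\n',
    'ENST00000376512,GTTG,CCGA\n',
    '\n',
]
print(find_bad_lines(sample))  # [(2, 'ENST00000376512,GTTG,CCGA')]
```

If the extra commas turn out to be a legitimate part of the value, line.split(',', 1) splits only at the first comma, so the unpacking into k and v works without a third name.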
