
Why does the int() builtin throw ValueError when reading parts of lines from a .txt file?

This is a subroutine which reads from studentNamesfile.txt

def calculate_average():
    '''Calculates and displays average mark.'''
    test_results_file = open('studentNamesfile.txt', 'r')

    total = 0
    num_recs = 0
    line = ' '
    while line != '':
        line = test_results_file.readline()
        # Convert everything after the delimiting pipe character to an integer, and add it to total.
        total += int(line[line.find('|') + 1:])
        num_recs += 1
    test_results_file.close()

[ num_recs holds the number of records read from the file.]

The format of studentNamesfile.txt is as follows:

Student 01|10
Student 02|20
Student 03|30

and so on. This subroutine is designed to read the mark for all the student records in the file, but I get this error when it runs:

Traceback (most recent call last):
  File "python", line 65, in <module>
  File "python", line 42, in calculate_average
ValueError: invalid literal for int() with base 10: ''

This error is pretty explicit, but I can't figure out why it's being thrown. I tried tracing the value of line[line.find('|') + 1:], but Python insists it has the correct value (e.g. 10) when I use print(line[line.find('|') + 1:]) on the previous line. What's wrong?

Update: I'm considering the possibility that line[line.find('|') + 1:] includes the newline, which is breaking int(). But using line[line.find('|') + 1:line.find('\\')] doesn't fix the problem; the same error is thrown.

Because the value passed to int() is not numeric. Python throws a ValueError whenever it cannot convert a string to an integer. You can use the code below to see which value fails.

def calculate_average():
    test_results_file = open('studentNamesfile.txt', 'r')
    total = 0
    num_recs = 0
    for line in test_results_file.readlines():
        try:
            total += int(line[line.find('|') + 1:])
            num_recs += 1
        except ValueError:
            print("Invalid Data: ", line[line.find('|') + 1:])
    test_results_file.close()
    print("total:", total)
    print("num_recs:", num_recs)
    print("Average:", float(total)/num_recs)

readlines vs readline

# Python 3: io.StringIO accepts a str directly.
from io import StringIO
s = 'hello\n hi\n how are you\n'
f = StringIO(s)
l = f.readlines()
print(l)
# OUTPUT: ['hello\n', ' hi\n', ' how are you\n']

f = StringIO(s)
l1 = f.readline()
# 'hello\n'
l2 = f.readline()
# ' hi\n'
l3 = f.readline()
# ' how are you\n'
l4 = f.readline()
# '' (end of input reached)
l5 = f.readline()
# '' (every further call also returns '')

readlines

readlines() returns a list of all lines, split on the \n character; each element keeps its trailing \n.

readline

The code above shows that the StringIO holds only 3 lines, yet once they are exhausted readline() keeps returning an empty string. Your loop passes that empty string to int(), which is why you get the ValueError exception.

Here:

while line != '':
    line = test_results_file.readline()

When you hit the end of the file, .readline() returns an empty string, but since this happens after the while line != '' test, you still try to process this line.
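
To make the ordering problem concrete, here is a minimal sketch that uses io.StringIO as a stand-in for the real file. The sample data and the function names marks_buggy and marks_fixed are illustrative assumptions, not part of the question's code.

```python
from io import StringIO

# Illustrative data in the same format as studentNamesfile.txt.
SAMPLE = "Student 01|10\nStudent 02|20\nStudent 03|30\n"

def marks_buggy(f):
    """Mirrors the question's loop: the test runs before the read."""
    marks = []
    line = ' '
    while line != '':
        line = f.readline()
        # At end of file line is '', so the slice is '' and int('') raises.
        marks.append(int(line[line.find('|') + 1:]))
    return marks

def marks_fixed(f):
    """Reads first, then tests, so the empty string is never converted."""
    marks = []
    while True:
        line = f.readline()
        if line == '':
            break
        marks.append(int(line[line.find('|') + 1:]))
    return marks

try:
    marks_buggy(StringIO(SAMPLE))
    buggy_raised = False
except ValueError:
    buggy_raised = True

fixed = marks_fixed(StringIO(SAMPLE))
print(buggy_raised, fixed)
```

The three real records convert without trouble in both versions; only the extra pass over the empty string at end of file triggers the error.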

The canonical (and much simpler) way to iterate over a file line by line, which is to, well, iterate over the file, would avoid this problem:

for line in test_results_file:
    do_something_with(line)

You'll just have to take care of calling .rstrip() on line if you want to get rid of the ending newline character (which is the case for your code).
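
Strictly speaking, int() itself ignores leading and trailing whitespace, so the trailing newline is not what breaks the conversion; the empty string is. Stripping still matters whenever the text itself is used. A small sketch with made-up sample values:

```python
# int() tolerates surrounding whitespace, including a trailing newline.
with_newline = int("10\n")

# An empty string (what readline() returns at end of file) is rejected.
try:
    int("")
    empty_ok = True
except ValueError:
    empty_ok = False

# rstrip() is still useful when the text is needed without the newline.
line = "Student 01|10\n"
name, mark = line.rstrip("\n").split("|")
print(with_newline, empty_ok, name, mark)
```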

Also, you want to make sure that the file is properly closed whatever happens. The canonical way is to use open() as a context manager:

with open("path/to/file.txt") as f:
    for line in f:
        do_something_with(line)

This will call f.close() when exiting the with block, however it's exited (whether the for loop just finished or an exception happened).
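
A quick way to convince yourself of this is sketched below. It assumes a throwaway temporary file with a deliberately malformed second record; the file contents and variable names are made up for the demonstration.

```python
import os
import tempfile

# Write a small throwaway file; the second record is deliberately malformed.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Student 01|10\nStudent 02|oops\n")
    path = tmp.name

handle = None
try:
    with open(path) as f:
        handle = f  # keep a reference so we can inspect it afterwards
        for line in f:
            int(line.split("|")[1])  # raises ValueError on the second line
except ValueError:
    pass

# The with block was left through an exception, yet the file is closed.
closed_after_error = handle.closed
os.remove(path)
print(closed_after_error)
```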

Also, instead of doing complex computation to find the part after the pipe, you can just split your string:

for line in test_results_file:
    total += int(line.strip().split("|")[1])
    num_recs += 1

And finally, you could use the stdlib's csv module to parse your file instead of doing it manually...
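
For illustration, a minimal sketch of the csv approach might look like this. It assumes the pipe-delimited format shown in the question and uses io.StringIO in place of the real file; with a real file you would pass open('studentNamesfile.txt', newline='') to csv.reader instead.

```python
import csv
from io import StringIO

# StringIO stands in for the opened file; the data is illustrative.
sample = StringIO("Student 01|10\nStudent 02|20\nStudent 03|30\n")

total = 0
num_recs = 0
for row in csv.reader(sample, delimiter="|"):
    if len(row) != 2:
        continue  # skip blank or malformed rows
    total += int(row[1])
    num_recs += 1

# Guard against an empty file before dividing.
average = total / num_recs if num_recs else 0.0
print(total, num_recs, average)
```

csv.reader yields each line as a list of fields, so no manual slicing or stripping of the newline is needed.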

A simpler approach.

Demo:

total = 0
num_recs = 0

with open(filename) as infile:                            #Read File
    for line in infile:                                   #Iterate Each line
        if "|" in line:                                   #Check if | in line
            total += int(line.strip().split("|")[-1])     #Extract value and sum
            num_recs += 1
print(total, num_recs)
