Why does reading part of a line from a .txt file with the built-in int() throw a ValueError?

Question

This is a subroutine which reads from studentNamesfile.txt:

def calculate_average():
    '''Calculates and displays average mark.'''
    test_results_file = open('studentNamesfile.txt', 'r')

    total = 0
    num_recs = 0
    line = ' '
    while line != '':
        line = test_results_file.readline()
        # Convert everything after the delimiting pipe character to an integer, and add it to total.
        total += int(line[line.find('|') + 1:])
        num_recs += 1
    test_results_file.close()

[num_recs holds the number of records read from the file.]

The format of studentNamesfile.txt is as follows:

Student 01|10
Student 02|20
Student 03|30

and so on. This subroutine is designed to read the mark for all the student records in the file, but I get this error when it runs:

Traceback (most recent call last):
  File "python", line 65, in <module>
  File "python", line 42, in calculate_average
ValueError: invalid literal for int() with base 10: ''

This error is pretty explicit, but I can't figure out why it's being thrown. I tried tracing the value of line[line.find('|') + 1:], but Python insists it has the correct value (e.g. 10) when I use print(line[line.find('|') + 1:]) on the previous line. What's wrong?

Update: I'm considering the possibility that line[line.find('|') + 1:] includes the newline, which is breaking int(). But using line[line.find('|') + 1:line.find('\\\\')] doesn't fix the problem - the same error is thrown.

Answer 1

Because it's not a numeric value. Python raises a ValueError when it is not able to convert the string into an integer. You can use the code below to check it.

def calculate_average():
    test_results_file = open('studentNamesfile.txt', 'r')
    total = 0
    num_recs = 0
    for line in test_results_file.readlines():
        try:
            total += int(line[line.find('|') + 1:])
            num_recs += 1
        except ValueError:
            print("Invalid Data: ", line[line.find('|') + 1:])
    test_results_file.close()
    print("total:", total)
    print("num_recs:", num_recs)
    print("Average:", float(total)/num_recs)

readlines vs readline

from io import StringIO
s = 'hello\n hi\n how are you\n'
f = StringIO(s)  # Python 3: str is already unicode
l = f.readlines()
print(l)
# OUTPUT: ['hello\n', ' hi\n', ' how are you\n']

f = StringIO(s)
l1 = f.readline()
# 'hello\n'
l2 = f.readline()
# ' hi\n'
l3 = f.readline()
# ' how are you\n'
l4 = f.readline()
# '' (end of file reached)
l5 = f.readline()
# '' (every later call returns '' as well)

readlines

If we use readlines, it returns a list of the lines, split on the \n character (each line keeps its trailing \n).

readline

From the code above we can see that the StringIO holds only 3 lines, but once they have been consumed, every further call to readline gives us an empty string. In your code that empty string is passed to int(), which is why you are getting the ValueError exception.

Answer 2

Here:

while line != '':
    line = test_results_file.readline()

When you hit the end of the file, .readline() returns an empty string, but since this happens after the while line != '' test, you still try to process this line.
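A minimal sketch that reproduces this, using io.StringIO as a stand-in for the real file (the contents are assumed from the sample records in the question). It suggests that the failure comes from the final empty string rather than from the newline, since int() ignores surrounding whitespace:

```python
from io import StringIO

# Stand-in for studentNamesfile.txt (assumed contents, taken from the question).
fake_file = StringIO("Student 01|10\nStudent 02|20\nStudent 03|30\n")

lines_read = []
line = ' '
while line != '':
    line = fake_file.readline()
    lines_read.append(line)  # the loop body still runs for the final ''

print(lines_read)

# int() ignores surrounding whitespace, including a trailing newline...
newline_ok = int("10\n")

# ...but an empty string is not a valid integer literal.
try:
    int('')
    empty_failed = False
    error_message = ''
except ValueError as exc:
    empty_failed = True
    error_message = str(exc)

print(newline_ok, empty_failed, error_message)
```

The last element of lines_read is the empty string that the original loop hands to int().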

The canonical (and much simpler) way to iterate over a file line by line is to, well, iterate over the file, which avoids this problem:

for line in test_results_file:
    do_something_with(line)

You'll just have to take care of calling .rstrip() on line if you want to get rid of the trailing newline character (which is the case for your code).

Also, you want to make sure that the file is properly closed whatever happens. The canonical way is to use open() as a context manager:

with open("path/to/file.txt") as f:
    for line in f:
        do_something_with(line)

This will call f.close() when exiting the with block, however it's exited (whether the for loop just finished or an exception happened).
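As a rough illustration of that guarantee, the sketch below writes a throwaway file (its contents, including the deliberately bad "oops" record, are made up for the example), lets the loop fail partway through, and then checks that the file was closed anyway:

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Student 01|10\nStudent 02|oops\n")
    path = tmp.name

try:
    with open(path) as f:
        for line in f:
            int(line.rstrip().split("|")[1])  # raises on the second record
except ValueError:
    pass

# f is still bound here, but the with block has already closed the file.
closed_after_error = f.closed
os.remove(path)
print(closed_after_error)
```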

Also, instead of doing complex computation to find the part after the pipe, you can just split your string:

for line in test_results_file:
    total += int(line.strip().split("|")[1])
    num_recs += 1

And finally, you could use the stdlib's csv module to parse your file instead of doing it manually...
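A minimal sketch of the csv approach, assuming the pipe-delimited format from the question. The helper name average_mark is hypothetical, and io.StringIO stands in for the real file; with a real file you would pass the object returned by open(path, newline='') instead:

```python
import csv
from io import StringIO


def average_mark(lines):
    """Return the mean mark from an iterable of 'name|mark' lines (hypothetical helper)."""
    total = 0
    num_recs = 0
    for row in csv.reader(lines, delimiter="|"):
        if len(row) < 2:  # skip blank or malformed rows
            continue
        total += int(row[1])
        num_recs += 1
    # Avoid dividing by zero when no valid record was found.
    return total / num_recs if num_recs else 0.0


sample = StringIO("Student 01|10\nStudent 02|20\nStudent 03|30\n")
result = average_mark(sample)
print(result)
```

Because the loop iterates over the reader, it stops at the end of the file without ever producing a trailing empty record.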

Answer 3

A simpler approach.

Demo:

total = 0
num_recs = 0

with open(filename) as infile:                            #Read File
    for line in infile:                                   #Iterate Each line
        if "|" in line:                                   #Check if | in line
            total += int(line.strip().split("|")[-1])     #Extract value and sum
            num_recs += 1
print(total, num_recs)
