When should I ever use file.read() or file.readlines()?

Question

I noticed that if I iterate over a file that I opened, it is much faster to iterate over it without "reading" it first.

i.e.

l = open('file', 'r')
for line in l:
    pass  # or code

is much faster than

l = open('file', 'r')
for line in l.readlines():  # or l.read()
    pass  # or code

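The second loop takes around 1.5x as much time (I used timeit on the exact same file, and the results were 0.442 vs. 0.660), and gives the same result.

So: when should I ever use .read() or .readlines()?

I always need to iterate over the file I'm reading, and after learning the hard way how painfully slow .read() can be on large data, I can't imagine ever using it again.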

Answer 1

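The short answer to your question is that each of these three ways of reading parts of a file has a different use case. f.read() reads the file as a single string, and so allows relatively easy file-wide manipulations, such as a file-wide regex search or substitution.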
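As a minimal sketch of the file-wide case, the snippet below reads a whole file with read() and applies one regex substitution whose matches span line breaks, which a line-by-line loop cannot do directly. The sample contents, the pattern, and the helper name collapse_blank_lines are made up for illustration.

```python
import os
import re
import tempfile


def collapse_blank_lines(path):
    """Return the file's text with runs of blank lines collapsed to one."""
    with open(path, 'r') as f:
        text = f.read()  # the whole file as one string
    # The pattern matches across line boundaries, so it needs the full text.
    return re.sub(r'\n{3,}', '\n\n', text)


# Hypothetical sample file, created only so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\n\n\n\nsecond\n\n\nthird\n')
    sample_path = tmp.name

cleaned = collapse_blank_lines(sample_path)
os.remove(sample_path)
print(repr(cleaned))
```

This only makes sense when the file fits comfortably in memory, since read() loads all of it at once.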

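f.readline() reads a single line of the file, allowing the user to parse one line without necessarily reading the entire file. Using f.readline() also makes it easier to apply logic while reading than a complete line-by-line iteration does, for example when a file changes format partway through.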
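A sketch of the "format changes partway through" case, assuming a hypothetical layout in which key=value header lines are followed by a --- separator and then comma-separated rows. The layout, the sample data, and the function name parse_mixed_file are illustrative only.

```python
import os
import tempfile


def parse_mixed_file(path):
    """Read 'key=value' header lines up to '---', then comma-separated rows."""
    settings, rows = {}, []
    with open(path, 'r') as f:
        # Header section: pull lines one at a time until the separator.
        while True:
            line = f.readline()
            if line == '' or line.strip() == '---':  # '' means end of file
                break
            key, _, value = line.strip().partition('=')
            settings[key] = value
        # Body section: the format has changed, so switch to plain iteration.
        for line in f:
            rows.append(line.rstrip('\n').split(','))
    return settings, rows


# Hypothetical sample file, created only so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('name=demo\nunits=cm\n---\n1,2\n3,4\n')
    mixed_path = tmp.name

settings, rows = parse_mixed_file(mixed_path)
os.remove(mixed_path)
print(settings, rows)
```

The readline() calls stop exactly at the separator, so the for loop picks up at the first data row.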

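Using the for line in f: syntax allows the user to iterate over the file line by line, as noted in the question.

(As noted in the other answer, this documentation is a very good read):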

https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects

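Note: It was previously claimed that f.readline() could be used to skip a line during a for loop iteration. However, this doesn't work in Python 2.7 and is perhaps a questionable practice, so the claim has been removed.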
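One way to skip a line inside a loop without mixing readline() into the iteration is to advance the same iterator with the built-in next(). The sketch below is only an illustration: io.StringIO stands in for a real file, and the marker text and function name are hypothetical.

```python
import io


def lines_skipping_after_marker(f, marker='# skip next'):
    """Collect lines, dropping each marker line and the line that follows it."""
    kept = []
    for line in f:
        if line.strip() == marker:
            next(f, None)  # consume and discard the following line, if any
            continue
        kept.append(line.rstrip('\n'))
    return kept


sample = io.StringIO('a\n# skip next\nb\nc\n')
result = lines_skipping_after_marker(sample)
print(result)
```

Passing None as the default to next() avoids a StopIteration if the marker happens to be the last line.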

Answer 2

Hope this helps!

https://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects

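When size is omitted or negative, the entire contents of the file will be read and returned; it's your problem if the file is twice as large as your machine's memory.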
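If the file may not fit in memory, passing a size to read() keeps each piece bounded. A minimal sketch, with io.StringIO standing in for a real file and an arbitrarily small chunk size chosen so the behaviour is easy to follow; the function name is hypothetical.

```python
import io


def count_chars_in_chunks(f, chunk_size=4):
    """Consume a file object in fixed-size pieces; return (chunks, total length)."""
    chunks = 0
    total = 0
    while True:
        chunk = f.read(chunk_size)  # at most chunk_size characters
        if not chunk:               # an empty string signals end of file
            break
        chunks += 1
        total += len(chunk)
    return chunks, total


sample = io.StringIO('0123456789')
chunks, total = count_chars_in_chunks(sample, chunk_size=4)
print(chunks, total)
```

In practice the chunk size would be much larger, for example tens of kilobytes.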

Sorry for all the edits!

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

for line in f:
    print line,

This is the first line of the file.
Second line of the file

Answer 3

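Note that readline() is not comparable to reading all lines in a for-loop, since it reads one line per call and, as others have already pointed out, each call carries overhead.

I ran timeit on two otherwise identical snippets, one using a for-loop and the other using readlines(). You can see my snippets below: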

  
def test_read_file_1():  
    f = open('ml/README.md', 'r')  
    for line in f.readlines():  
        print(line)  
  
  
def test_read_file_2():  
    f = open('ml/README.md', 'r')  
    for line in f:  
        print(line)  
  
  
def test_time_read_file():  
    from timeit import timeit  
  
    duration_1 = timeit(lambda: test_read_file_1(), number=1000000)  
    duration_2 = timeit(lambda: test_read_file_2(), number=1000000)  
  
    print('duration using readlines():', duration_1)  
    print('duration using for-loop:', duration_2)

Results:

duration using readlines(): 78.826229238
duration using for-loop: 69.487692794

My bottom line is that the for-loop is faster, but where both are possible I would still rather use readlines().
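The snippets above call print() for every line, so output cost is probably a large share of both timings. A sketch that times only the reading, on a throwaway temporary file, might look like the following. The file contents, line count, and repeat count are arbitrary choices, and the numbers will differ from machine to machine.

```python
import os
import tempfile
from timeit import timeit

# Hypothetical test file, created only so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('some text on a line\n' * 10000)
    bench_path = tmp.name


def count_with_readlines():
    with open(bench_path, 'r') as f:
        return sum(1 for _ in f.readlines())  # builds the full list first


def count_with_iteration():
    with open(bench_path, 'r') as f:
        return sum(1 for _ in f)  # pulls lines lazily from the file object


duration_readlines = timeit(count_with_readlines, number=50)
duration_iteration = timeit(count_with_iteration, number=50)
lines_a, lines_b = count_with_readlines(), count_with_iteration()
os.remove(bench_path)

print('duration using readlines():', duration_readlines)
print('duration using for-loop:', duration_iteration)
```

Both functions count the same lines, so any difference in the timings comes from how the lines are fetched rather than from what is done with them.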

Answer 4

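When you know that the data you are interested in starts at, for example, the second line, readlines() is more convenient than for line in file. You can simply write readlines()[1:].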

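A typical use case is a tab- or comma-separated value file whose first line is a header (and where you don't want to use an additional module for tsv or csv files).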
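A sketch of that header-skipping case on a small tab-separated sample, with io.StringIO standing in for a real file and made-up data. The itertools.islice variant is an addition for comparison, not something the answer itself proposes; it yields the same rows without first building a list of every line.

```python
import io
from itertools import islice

sample_tsv = 'name\tage\nann\t31\nbob\t27\n'  # hypothetical data

# Slice off the header after reading every line into a list.
f = io.StringIO(sample_tsv)
rows_from_readlines = [line.rstrip('\n').split('\t')
                       for line in f.readlines()[1:]]

# Skip the header lazily while iterating over the file object.
f = io.StringIO(sample_tsv)
rows_from_islice = [line.rstrip('\n').split('\t')
                    for line in islice(f, 1, None)]

print(rows_from_readlines)
print(rows_from_islice)
```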

Answer 5

# The difference between file.read(), file.readline(), and file.readlines()
file = open('samplefile', 'r')
single_string = file.read()      # Reads the whole file into a single string
                                 # (\n characters are included)
file.seek(0)                     # Rewind: read() left the cursor at end of file
line = file.readline()           # Reads the line at the current cursor position
                                 # as a string and moves to the next line
list_strings = file.readlines()  # Makes a list of strings from the remaining lines
file.close()

Answer 6

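That was a brilliant answer. It is worth mentioning that each call to readline() reads one line and moves past it, so the next call will not return that line again. You can use seek() to return to an earlier position. To go back to the start of the file, simply call f.seek(0).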

Similarly, f.tell() tells you your current position in the file.
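A small sketch of readline(), tell(), and seek() working together, with io.StringIO standing in for a real file and made-up contents. For real text-mode files, the value returned by tell() should be treated as an opaque marker and only handed back to seek().

```python
import io

f = io.StringIO('first line\nsecond line\n')

start = f.tell()           # position before anything is read
first = f.readline()       # reads 'first line\n' and moves past it
after_first = f.tell()     # remember where the second line begins
second = f.readline()      # reads 'second line\n'

f.seek(0)                  # back to the start of the file
first_again = f.readline()

f.seek(after_first)        # back to the remembered position
second_again = f.readline()

print(repr(first_again), repr(second_again))
```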

6 answers

Answer 1: score 33, accepted, 2016-06-29 16:51:51
Answer 2: score 2, 2016-06-29 16:47:14
Answer 3: score 0, 2020-09-29 15:27:09
Answer 4: score 0, 2021-03-09 14:38:12
Answer 5: score 0, 2021-12-01 17:41:49
Answer 6: score -4, 2018-05-14 12:00:47
