Python's function readlines(n) behavior

I've read the documentation, but what does readlines(n) actually do? By readlines(n), I mean readlines(3) or any other number.

When I run readlines(3), it returns the same thing as readlines().

The optional argument is an approximate limit on how many bytes are read from the file. Reading continues past that limit until the current line ends:

readlines([size]) -> list of strings, each a line from the file.

Call readline() repeatedly and return a list of the lines so read.
The optional size argument, if given, is an approximate bound on the
total number of bytes in the lines returned.

Another quote:

If given an optional parameter sizehint, it reads that many bytes from the file and enough more to complete a line, and returns the lines from that.

You're right that it doesn't seem to do much for small files, which is interesting:

In [1]: open('hello').readlines()
Out[1]: ['Hello\n', 'there\n', '!\n']

In [2]: open('hello').readlines(2)
Out[2]: ['Hello\n', 'there\n', '!\n']

One might think it's explained by the following phrase in the documentation:

Read until EOF using readline() and return a list containing the lines thus read. If the optional sizehint argument is present, instead of reading up to EOF, whole lines totalling approximately sizehint bytes (possibly after rounding up to an internal buffer size) are read. Objects implementing a file-like interface may choose to ignore sizehint if it cannot be implemented, or cannot be implemented efficiently.

However, even when I try to read the file without buffering, it doesn't seem to change anything, which means some other kind of internal buffer is meant:

In [4]: open('hello', 'r', 0).readlines(2)
Out[4]: ['Hello\n', 'there\n', '!\n']

On my system, this internal buffer size seems to be around 5k bytes / 1.7k lines:

In [1]: len(open('hello', 'r', 0).readlines(5))
Out[1]: 1756

In [2]: len(open('hello', 'r', 0).readlines())
Out[2]: 28080
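The transcripts above use open('hello', 'r', 0), which only works in Python 2; Python 3 refuses unbuffered text mode. As a rough Python 3 counterpart, here is a minimal sketch that assumes a recreated three-line file (the temporary file and the variable names are made up for illustration). In Python 3 the hint is documented as a limit on the total size of the lines returned, so on a small file it should already make a difference:

```python
import os
import tempfile

# Recreate the small three-line 'hello' file from the first transcript
# (assumed content, written to a throwaway temporary file).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"Hello\nthere\n!\n")
    path = tmp.name

# Python 3 rejects unbuffered *text* mode with a ValueError.
try:
    open(path, "r", 0)
    text_mode_error = None
except ValueError as exc:
    text_mode_error = str(exc)

# Unbuffered *binary* mode is allowed, and the hint takes effect.
with open(path, "rb", 0) as f:
    hinted = f.readlines(2)
with open(path, "rb", 0) as f:
    everything = f.readlines()

os.remove(path)

print(text_mode_error)
print(hinted)      # only the first line is expected here
print(everything)  # all three lines
```

Note that the lines come back as bytes objects here because the file is opened in binary mode.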

Depending on the size of the file, readlines(hint) should return a smaller set of lines. From the documentation:

f.readlines() returns a list containing all the lines of data in the file. 
If given an optional parameter sizehint, it reads that many bytes from the file 
and enough more to complete a line, and returns the lines from that. 
This is often used to allow efficient reading of a large file by lines, 
but without having to load the entire file in memory. Only complete lines 
will be returned.

So, if your file has thousands of lines, you can pass in, say, 65536, and each call will read only about that many bytes plus enough to complete the current line, returning only the lines that were read completely.
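The batch-by-batch reading described above can be sketched as a small generator. This is only an illustration under assumptions: iter_line_batches is a hypothetical helper name, the demo file is generated on the fly, and 65536 is just the figure mentioned above, not a recommended value.

```python
import os
import tempfile


def iter_line_batches(path, hint=65536):
    """Yield lists of complete lines, roughly `hint` bytes per list."""
    with open(path) as f:
        while True:
            batch = f.readlines(hint)
            if not batch:  # an empty list means end of file
                break
            yield batch


# Build a throwaway file with many short lines to demonstrate.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(100_000))
    demo_path = tmp.name

batches = list(iter_line_batches(demo_path, hint=65536))
total_lines = sum(len(b) for b in batches)
os.remove(demo_path)

print(len(batches), "batches,", total_lines, "lines in total")
```

Each batch contains only whole lines, so a line is never split between two batches, and only one batch needs to be held in memory at a time if you process them as they are yielded.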

It returns the lines spanned by the given size n (counted in characters), starting from the current line.

Example: for a text file with the following content:

one
two
three
four

open('text').readlines(0) returns ['one\n', 'two\n', 'three\n', 'four\n']

open('text').readlines(1) returns ['one\n']

open('text').readlines(3) returns ['one\n']

open('text').readlines(4) returns ['one\n', 'two\n']

open('text').readlines(7) returns ['one\n', 'two\n']

open('text').readlines(8) returns ['one\n', 'two\n', 'three\n']

open('text').readlines(100) returns ['one\n', 'two\n', 'three\n', 'four\n']
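To try this yourself, here is a minimal Python 3 sketch that recreates the four-line file in a temporary location (the file path and variable names are placeholders). The hints are deliberately chosen away from exact line boundaries such as 4 or 8, since the documentation only says that reading stops once the total size exceeds the hint, and behavior exactly at a boundary may depend on the implementation.

```python
import os
import tempfile

# Recreate the four-line file from the example above.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\nfour\n")
    path = tmp.name

results = {}
for hint in (0, 1, 5, 100):
    # Reopen the file each time so every call starts at the first line.
    with open(path) as f:
        results[hint] = f.readlines(hint)
    print(hint, results[hint])

os.remove(path)
```

A hint of 0 is treated as "no hint", which is why it returns every line.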
