
In Python, what is a functional and memory-efficient way to read standard input line by line?

I have this answer from another post (https://stackoverflow.com/a/1450396/1810962), which almost achieves it:

import sys
data = sys.stdin.readlines()
preProcessed = map(lambda line: line.rstrip(), data)

I can now operate on the lines in data in a functional way by applying filter, map, etc. However, readlines() loads all of standard input into memory first. Is there a lazy way to build a stream of lines?

Just iterate over sys.stdin: it is an iterator that yields one line at a time.

Then you can stack generator expressions, or use map and filter if you prefer (in Python 3, both return lazy iterators). Each incoming line passes through the whole pipeline as it arrives, and no list is built in the process.

Here are examples of each:

import sys

stripped_lines = (line.strip() for line in sys.stdin)
lines_with_prompt = ('--> ' + line for line in stripped_lines)
uppercase_lines = map(lambda line: line.upper(), lines_with_prompt)
lines_without_dots = filter(lambda line: '.' not in line, uppercase_lines)

for line in lines_without_dots:
    print(line)

And in action, in the terminal:

thierry@amd:~$ ./test.py 
My first line
--> MY FIRST LINE
goes through the pipeline
--> GOES THROUGH THE PIPELINE
but not this one, filtered because of the dot. 
This last one will go through
--> THIS LAST ONE WILL GO THROUGH
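To check that the pipeline really is lazy, here is a minimal sketch that wraps the same four stages in a function and feeds it from an in-memory stream instead of the terminal. The names pipeline, traced and consumed are hypothetical helpers introduced only for this illustration, and io.StringIO stands in for sys.stdin.

```python
import io


def pipeline(lines):
    """Apply the same four stages as above to any iterable of lines."""
    stripped = (line.strip() for line in lines)
    prompted = ('--> ' + line for line in stripped)
    upper = map(str.upper, prompted)
    return filter(lambda line: '.' not in line, upper)


consumed = []  # hypothetical log of the lines actually pulled from the source


def traced(lines):
    """Yield lines unchanged, recording each one as it is read."""
    for line in lines:
        consumed.append(line)
        yield line


# io.StringIO stands in for sys.stdin here; both yield one line per iteration.
source = io.StringIO("first\nhas a dot.\nthird\n")
result = pipeline(traced(source))

read_before = list(consumed)      # building the pipeline reads nothing
first = next(result)              # pulls exactly one line through all stages
read_after_first = list(consumed)
rest = list(result)               # drains the remaining lines
```

Building the pipeline reads nothing from the source, and asking for the first result pulls only the first line. Passing sys.stdin instead of traced(source) should behave the same way, line by line.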

A shorter example using map only, where map iterates directly over the lines of stdin:

import sys

uppercase_lines = map(lambda line: line.upper(), sys.stdin)

for line in uppercase_lines:
    print(line)

In action:

thierry@amd:~$ ./test2.py 
this line will turn
THIS LINE WILL TURN

to uppercase
TO UPPERCASE
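The blank lines in the output above appear because each line read from stdin keeps its trailing newline, and print adds a second one. A possible variant, sketched here under the assumption that you want one output line per input line, strips the newline lazily before uppercasing. The function name uppercase_stream and the fake_stdin stream are hypothetical and used only for illustration.

```python
import io


def uppercase_stream(lines):
    """Lazily drop each line's trailing newline, then uppercase it."""
    without_newline = map(lambda line: line.rstrip('\n'), lines)
    return map(str.upper, without_newline)


# fake_stdin stands in for sys.stdin so the sketch runs without a terminal.
fake_stdin = io.StringIO("this line will turn\nto uppercase\n")
output = list(uppercase_stream(fake_stdin))

for line in output:
    print(line)
```

In a real script you would pass sys.stdin to uppercase_stream and loop over the result directly, without the intermediate list, so that lines are still processed one at a time.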
