How do I pipe the output of one Python script to another Python script?

I got stuck piping the output of one script into another script (both are Python).

This question is very similar, but (1) it does not provide an answer and (2) my case is slightly different. So I thought opening a new question would be better.

Here is the problem. Both scripts are almost identical:

receiver.py

import sys
import time

for line in sys.stdin:
    sys.stdout.write(line)
    sys.stdout.flush()
    time.sleep(3)

replicator.py

import sys
import time

for line in sys.stdin:
    sys.stderr.write(line)
    sys.stderr.flush()
    time.sleep(3)

When I run these scripts one at a time in bash or cmd, everything is fine. Both examples below work, and I see the input text in the output:

Works: (one line of output appears every 3 seconds)

cat data.txt | python receiver.py
cat data.txt | python replicator.py

But once I pipe one script into the other, they stop working:

Doesn't work: (nothing appears until the end of the file is reached)

cat data.txt | python receiver.py | python replicator.py

Then, when I pipe the first script into another tool, it works again!

Works:

cat data.txt | python receiver.py | cat -n
cat data.txt | python replicator.py | cat -n

And finally, when I remove the blocking sleep() call, it starts to work again:

Removing the timer:

time.sleep(0)

Now it works:

cat data.txt | python receiver.py | python replicator.py

Does anybody know what is wrong with my piping? I am not looking for alternative ways to do it. I just want to learn what is happening here.

UPDATE

Based on the comments, I refined the examples. Now both scripts not only print the content of data.txt but also add a timestamp to each line.

receiver.py

import sys
import time
import datetime

for line in sys.stdin:
    sys.stdout.write(str(datetime.datetime.now().strftime("%H:%M:%S"))+'\t')
    sys.stdout.write(line)
    sys.stdout.flush()
    time.sleep(1)

data.txt

Line-A
Line-B
Line-C
Line-D

The result

$> cat data.txt
Line-A
Line-B
Line-C
Line-D

$> cat data.txt | python receiver.py
09:05:44        Line-A
09:05:45        Line-B
09:05:46        Line-C
09:05:47        Line-D

$> cat data.txt | python receiver.py | python receiver.py
09:05:54        09:05:50        Line-A
09:05:55        09:05:51        Line-B
09:05:56        09:05:52        Line-C
09:05:57        09:05:53        Line-D

$> cat test.log | python receiver.py | sed -e "s/^/$(date +"%H:%M:%S") /"
09:17:55        09:17:55        Line-A
09:17:55        09:17:56        Line-B
09:17:55        09:17:57        Line-C
09:17:55        09:17:58        Line-D

$> cat test.log | python receiver.py | cat | python receiver.py
09:36:21        09:36:17        Line-A
09:36:22        09:36:18        Line-B
09:36:23        09:36:19        Line-C
09:36:24        09:36:20        Line-D

As you can see, when I pipe the output of the Python script into itself, the second script waits until the first one has finished. Only then does it start to digest the data.

However, when I use another tool (sed in this example), the tool receives the data immediately. Why is this happening?

This is caused by the internal read-ahead buffering that Python 2 file objects use during iteration (for line in sys.stdin): the loop does not receive a line until the buffer fills or the input reaches end of file.

So, if we fetch the input line by line:

import sys
import time
import datetime

while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write(str(datetime.datetime.now().strftime("%H:%M:%S"))+'\t')
    sys.stdout.write(line)
    sys.stdout.flush()
    time.sleep(1)

The code will work as expected:

$ cat data.txt | python receiver.py |  python receiver.py
09:43:46        09:43:46        Line-A
09:43:47        09:43:47        Line-B
09:43:48        09:43:48        Line-C
09:43:49        09:43:49        Line-D
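
The same line-by-line loop can be written more compactly with the two-argument form of iter(), which keeps calling readline() until it returns an empty string. This is only a sketch and not part of the original answer: the function name relay, its parameters, and the in-memory demonstration are hypothetical, and the demonstration assumes Python 3 (io.StringIO with ordinary string literals).

```python
import datetime
import io
import time

def relay(instream, outstream, delay=1.0):
    """Copy lines one at a time, prefixing each with a timestamp.

    relay and its parameters are hypothetical names for this sketch.
    Returns the number of lines copied.
    """
    count = 0
    # iter(callable, sentinel) calls instream.readline() repeatedly and
    # stops when it returns '' (end of file), so no read-ahead iterator
    # is involved.
    for line in iter(instream.readline, ''):
        stamp = datetime.datetime.now().strftime("%H:%M:%S")
        outstream.write(stamp + '\t' + line)
        outstream.flush()
        time.sleep(delay)
        count += 1
    return count

# In-memory demonstration (no real pipe involved, delay disabled).
demo_out = io.StringIO()
demo_count = relay(io.StringIO("Line-A\nLine-B\n"), demo_out, delay=0)
```

Calling relay(sys.stdin, sys.stdout) at the end of receiver.py (after importing sys) would give a script equivalent to the while loop above.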

Documentation

... Note that there is internal buffering in file.readlines() and File Objects ( for line in sys.stdin ) which is not influenced by this option. To work around this, you will want to use file.readline() inside a while 1: loop.
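
To make the quoted behavior more concrete, here is a toy model of a read-ahead line iterator. It is not CPython's actual implementation; the name read_ahead_lines and the 8192 default buffer size are assumptions chosen for illustration. The point is that lines are handed out only after a whole chunk has been read, so when the writer produces one short line per second, a blocking chunk read may not return until the buffer fills or the writer closes the pipe.

```python
import io

def read_ahead_lines(stream, bufsize=8192):
    """Toy model of a read-ahead line iterator (hypothetical, for illustration)."""
    pending = ""
    while True:
        # A blocking read of up to bufsize characters. On a slow pipe this
        # call can wait until the buffer fills or the writer reaches EOF.
        chunk = stream.read(bufsize)
        if not chunk:
            break
        pending += chunk
        # Hand out every complete line collected so far.
        while "\n" in pending:
            line, _, pending = pending.partition("\n")
            yield line + "\n"
    if pending:
        # Final line without a trailing newline.
        yield pending

# In-memory demonstration with a tiny buffer to force several chunk reads.
demo_lines = list(read_ahead_lines(io.StringIO("Line-A\nLine-B\nLine-C"), bufsize=4))
```

By contrast, readline() returns as soon as one complete line is available, which is why the while loop in the answer shows each line immediately.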

NOTE: The file object behavior was fixed in Python 3.
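
Under Python 3 the original for-loop form can therefore be kept. The following is a minimal sketch assuming Python 3, where print(..., flush=True) replaces the separate write and flush calls; the function name stamp_lines and its parameters are hypothetical and not taken from the question.

```python
import datetime
import io
import time

def stamp_lines(instream, outstream, delay=1.0):
    """Prefix each input line with a timestamp (hypothetical helper, Python 3)."""
    for line in instream:
        stamp = datetime.datetime.now().strftime("%H:%M:%S")
        # end='' because each line already carries its own newline;
        # flush=True pushes the line downstream right away.
        print(stamp, line, sep='\t', end='', file=outstream, flush=True)
        time.sleep(delay)

# In-memory demonstration (no real pipe involved, delay disabled).
py3_out = io.StringIO()
stamp_lines(io.StringIO("Line-A\nLine-B\n"), py3_out, delay=0)
```

In a script, the call would be stamp_lines(sys.stdin, sys.stdout) after importing sys.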
