Repeatedly write to stdin and read from stdout of a process from Python

Question

I have a piece of Fortran code that reads some numbers from STDIN and writes the results to STDOUT. For example:

do
  read (*,*) x
  y = x*x
  write (*,*) y
enddo

So I can start the program from a shell and get the following sequence of inputs and outputs:

5
25.0
2.5
6.25

Now I need to do this from within Python. After wrestling in vain with subprocess.Popen and looking through old questions on this site, I decided to use pexpect.spawn:

import pexpect, os
p = pexpect.spawn('squarer')
p.setecho(False)
p.write("2.5" + os.linesep)
res = p.readline()

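It works. The problem is that the real data I need to pass between Python and my Fortran program is an array of 100,000 (or more) double-precision floats. If they are contained in an array called x, then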

p.write(' '.join(["%.10f"%k for k in x]) + os.linesep)

times out with the following error message from pexpect:

buffer (last 100 chars):   
before (last 100 chars):   
after: <class 'pexpect.TIMEOUT'>  
match: None  
match_index: None  
exitstatus: None
flag_eof: False
pid: 8574
child_fd: 3
closed: False
timeout: 30
delimiter: <class 'pexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

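unless x has fewer than 303 elements. Is there a way to pass large amounts of data to and from the STDIN / STDOUT of another program?

I have tried splitting the data into smaller chunks, but then I lose a lot of speed.

Thanks in advance.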

Answer 1

I found a solution using the subprocess module, so I am posting it here for reference in case anyone needs to do the same thing.

import subprocess as sbp
import numpy as np

class ExternalProg:

    def __init__(self, arg_list):
        # arg_list is a list such as ['./optimizer'], so no shell is needed;
        # universal_newlines=True makes the pipes accept and return str
        self.opt = sbp.Popen(arg_list, stdin=sbp.PIPE, stdout=sbp.PIPE,
                             close_fds=True, universal_newlines=True)

    def toString(self, x):
        # one space-separated line with 12 decimal places per value
        return ' '.join(["%.12f" % k for k in x])

    def toFloat(self, x):
        # parse one whitespace-separated line into a float64 array
        return np.array(x.split(), dtype=np.float64)

    def sendString(self, string):
        if not string.endswith('\n'):
            string = string + '\n'
        self.opt.stdin.write(string)
        self.opt.stdin.flush()  # push the line out of Python's buffer

    def sendArray(self, x):
        self.sendString(self.toString(x))

    def readInt(self):
        return int(self.opt.stdout.readline().strip())

    def sendScalar(self, x):
        if isinstance(x, int):
            self.sendString("%i" % x)
        elif isinstance(x, float):
            self.sendString("%.12f" % x)

    def readArray(self):
        return self.toFloat(self.opt.stdout.readline())

    def close(self):
        self.opt.kill()

The class is invoked with an external program called 'optimizer' as follows:

optim = ExternalProg(['./optimizer'])
optim.sendScalar(500) # send the optimizer the length of the state vector, for example
optim.sendArray(init_x) # the initial guess for x
optim.sendArray(init_g) # the initial gradient g
next_x = optim.readArray() # get the next estimate of x
next_g = evaluateGradient(next_x) # calculate gradient at next_x from within python
# repeat until convergence

On the Fortran side (the program compiled to give the executable 'optimizer'), a 500-element vector would be read in like this:

read(*,*) input_vector(1:500)

and would be written out like this:

write(*,'(500f18.11)') output_vector(1:500)

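That's it! I have tested it with state vectors of up to 200,000 elements (which is the upper limit of what I need right now). I hope this helps someone other than myself. This solution works with ifort and xlf90, but not with gfortran, for some reason I don't understand.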
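For anyone who wants to try the line-per-message pattern above without a Fortran compiler at hand, here is a minimal, self-contained sketch. It is not the optimizer from this answer: the child process is a hypothetical Python stand-in that squares every number on a line, and round_trip is a helper name made up for the illustration. It assumes the child reads one full line and answers with one full line, as the read/write statements above do.

```python
import subprocess
import sys

# Hypothetical stand-in for the Fortran executable: for each input line,
# square every number and write the results back on a single line.
CHILD_SOURCE = r"""
import sys
for line in iter(sys.stdin.readline, ''):
    values = [float(tok) for tok in line.split()]
    sys.stdout.write(' '.join('%.12f' % (v * v) for v in values) + '\n')
    sys.stdout.flush()
"""

def round_trip(values):
    """Send one line of numbers to the child and read one line back."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True)  # text-mode pipes (str, not bytes)
    try:
        proc.stdin.write(' '.join('%.12f' % v for v in values) + '\n')
        proc.stdin.flush()  # make sure the line actually leaves Python
        reply = proc.stdout.readline()
        return [float(tok) for tok in reply.split()]
    finally:
        proc.stdin.close()   # EOF ends the child's read loop
        proc.stdout.close()
        proc.wait()

squares = round_trip([0.5 * i for i in range(100000)])
print(len(squares))
```

The flush after each write matters whenever the pipe is buffered on the Python side; without it the child may never see the line and both processes wait on each other.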

Answer 2

An example squarer.py program (it just happens to be in Python; use your Fortran executable instead):

#!/usr/bin/python
import sys
data= sys.stdin.readline() # expecting lots of data in one line
processed_data= data[-2::-1] # reverse without the newline
sys.stdout.write(processed_data+'\n')

An example target.py program:

import queue
import subprocess as sbp
import sys
import threading

class Companion(object):
    "A companion process manager"
    def __init__(self, cmdline):
        "Start the companion process"
        self.companion= sbp.Popen(
            cmdline, shell=False,
            stdin=sbp.PIPE,
            stdout=sbp.PIPE,
            universal_newlines=True) # text-mode pipes (str, not bytes)
        self.putque= queue.Queue()
        self.getque= queue.Queue()
        # daemon threads, so a blocked reader cannot keep the interpreter alive
        for target, que in ((self._sender, self.putque),
                            (self._receiver, self.getque)):
            worker= threading.Thread(target=target, args=(que,))
            worker.daemon= True
            worker.start()

    def _sender(self, que):
        "Actually sends the data to the companion process"
        while 1:
            datum= que.get()
            if datum is Ellipsis:
                break
            if not datum.endswith('\n'):
                datum+= '\n'
            self.companion.stdin.write(datum)
            self.companion.stdin.flush() # do not leave the line in the buffer
        self.companion.stdin.close() # EOF lets the companion exit

    def _receiver(self, que):
        "Actually receives data from the companion process"
        while 1:
            datum= self.companion.stdout.readline()
            que.put(datum)
            if not datum: # empty string means EOF; stop reading
                break

    def close(self):
        self.putque.put(Ellipsis)

    def send(self, data):
        "Schedule a long line to be sent to the companion process"
        self.putque.put(data)

    def recv(self):
        "Get a long line of output from the companion process"
        return self.getque.get()

def main():
    my_data= '12345678 ' * 5000
    my_companion= Companion((sys.executable, "squarer.py"))

    my_companion.send(my_data)
    my_answer= my_companion.recv()
    print(my_answer[:20]) # don't print the long stuff
    # rinse, repeat

    my_companion.close()

if __name__ == "__main__":
    main()

The main function contains the code you would use: set up a Companion object, companion.send a long line of data, companion.recv a line. Repeat as necessary.

Answer 3

Here is a huge simplification: break your Python into two parts.

python source.py | squarer | python sink.py

The squarer application is your Fortran code. It reads from stdin and writes to stdout.

Your source.py is your Python that does

import os, sys
sys.stdout.write(' '.join(["%.10f"%k for k in x]) + os.linesep)

Or, perhaps something a tiny bit simpler, i.e.

from __future__ import print_function
print( ' '.join(["{0:.10f}".format(k) for k in x]) )

Your sink.py is something like this.

import fileinput
for line in fileinput.input():
    pass  # process the line

Separating source, squarer and sink gets you 3 separate processes (instead of 2) and will use more cores. More cores == more concurrency == more fun.
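As a rough sketch of what "process the line" in sink.py could look like, assuming squarer writes whitespace-separated numbers on each line: parse_line and read_vectors are hypothetical helper names, and the sample lines below stand in for real squarer output. In sink.py itself you would pass fileinput.input() to read_vectors instead of the sample list.

```python
def parse_line(line):
    """Convert one whitespace-separated line of numbers into a list of floats."""
    return [float(token) for token in line.split()]

def read_vectors(lines):
    """Yield one list of floats per non-empty input line."""
    for line in lines:
        if line.strip():
            yield parse_line(line)

# Stand-in for the lines that would arrive on stdin from squarer
sample_output = ["25.00000000000 6.25000000000\n", "\n", "4.0 9.0\n"]
for vector in read_vectors(sample_output):
    print(vector)
```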

Answer 4

I think that you are only adding one line break here:

p.write(' '.join(["%.10f"%k for k in x]) + os.linesep)

instead of adding one per line.
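If the program on the other end is the read (*,*) x loop from the question, each read statement consumes one input line, so every value needs its own line. A minimal sketch of that formatting, where one_value_per_line is a hypothetical helper name and the "%.10f" format is taken from the question:

```python
def one_value_per_line(values, fmt="%.10f"):
    """Format each value on its own line, each line ending in a newline."""
    return ''.join(fmt % v + '\n' for v in values)

# Two values become two separate input lines for the child process
payload = one_value_per_line([5, 2.5])
print(repr(payload))
```

The resulting string can then be handed to p.write in a single call.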

Answer 5

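It looks like you are timing out (the default timeout, I believe, is 30 seconds) because preparing, sending, receiving, and processing that much data takes a lot of time. Per the docs, timeout= is an optional named parameter to the expect method, which you are not calling; maybe there is an undocumented way to set the default timeout in the initializer, which could be found by poring over the sources (or, in the worst case, created by hacking those sources).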

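If the Fortran program read and saved (say) 100 items at a time, with a prompt, syncing up would become enormously easier. Could you modify your Fortran code for that purpose, or would you rather go for the undocumented / hack approach?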
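The Python half of that batching idea might look like the sketch below. It only covers splitting and formatting; the Fortran side would still have to be changed to read one batch per line and emit a prompt, and chunks and format_chunk are hypothetical helper names, with the batch size of 100 taken from the suggestion above.

```python
def chunks(values, size=100):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def format_chunk(chunk, fmt="%.10f"):
    """Format one batch as a single space-separated line."""
    return ' '.join(fmt % v for v in chunk) + '\n'

# 250 values become three lines holding 100, 100 and 50 numbers
lines = [format_chunk(c) for c in chunks(list(range(250)), 100)]
print(len(lines), len(lines[-1].split()))
```

Each line would be written, and the prompt awaited, before the next batch is sent.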

5 answers

Answer 1    6    2010-12-03 09:00:45
Answer 2    2    2010-09-16 13:17:04
Answer 3    1    2010-08-17 15:33:53
Answer 4    0    2010-08-17 15:09:15
Answer 5    0    2010-08-17 15:12:01
