Pass a file from a PHP server to a Python server (HTTP request)
I have a web application running on a Laravel PHP server. For some needs (Word document processing), I implemented a Python server that does data extraction. I would like to know how to call my Python server from PHP by passing a file to it.
Currently, I save the docx file on the PHP server, accessible via a URL. I make an HTTP POST request from the PHP server to the Python server with the URL to download the document.
The problem is that I get a deadlock since the PHP server is waiting on the response from the Python server and the Python server is waiting on the PHP server to download the document. Any suggestions on how to get around this problem?
Here is the PHP code:
// Send POST REQUEST
$context_options = array(
    'http' => array(
        'method' => 'POST',
        'header' => "Content-type: application/x-www-form-urlencoded\r\n"
            . "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data,
        'timeout' => 10,
    )
);
$context = stream_context_create($context_options);
$result = fopen('http://localhost:5000/api/extraction','r', false, $context);
And here is the Python code:
@app.route('/api/extraction', methods=['post'])
def extraction():
    data = request.form.to_dict()
    url = data['file']  # get url
    filename = secure_filename(url.rsplit('/', 1)[-1])
    path = os.path.join(app.config['UPLOAD_FILE_FOLDER'], filename)
    urllib.request.urlretrieve(url, path)
You should send the file through a proper POST (multipart/form-data) request instead of having Python fetch the data. A single upload request is much easier to debug and maintain than your current 2-roundtrip approach.
<?php
/**
 * A generator that yields multipart form-data fragments (without the ending EOL).
 * File contents are sent as-is with "Content-Type: application/octet-stream";
 * nothing is base64-encoded here.
 *
 * @param iterable $vars
 *   Key-value iterable (e.g. assoc array) of strings or integers.
 *   Keys represent the field names.
 * @param iterable $files
 *   Key-value iterable (e.g. assoc array) of file path strings.
 *   Keys represent the field names of the file uploads.
 *
 * @return \Generator
 *   Generator of multipart form-data fragments (without the ending EOL) in array format,
 *   each always containing 2 values:
 *     0 - An array of headers for a key-value pair.
 *     1 - A value string (can contain binary content) of the key-value pair.
 */
function generate_multipart_data_parts(iterable $vars, iterable $files=[]): Generator {
    // handle normal variables
    foreach ($vars as $name => $value) {
        $name = urlencode($name);
        // multipart values are sent verbatim; the receiver does not URL-decode them
        $value = (string) $value;
        yield [
            // header
            ["Content-Disposition: form-data; name=\"{$name}\""],
            // value
            $value,
        ];
    }
    // handle file contents
    foreach ($files as $file_fieldname => $file_path) {
        $file_fieldname = urlencode($file_fieldname);
        $file_data = file_get_contents($file_path);
        if ($file_data === false) {
            throw new \Exception("Error: cannot read file {$file_path}");
        }
        yield [
            // header
            [
                "Content-Disposition: form-data; name=\"{$file_fieldname}\"; filename=\"".basename($file_path)."\"",
                "Content-Type: application/octet-stream", // for binary safety
            ],
            // value
            $file_data
        ];
    }
}
/**
 * Converts output of generate_multipart_data_parts() into form data.
 *
 * @param iterable $parts
 *   An iterator of form fragment arrays. See return data of
 *   generate_multipart_data_parts().
 * @param string|null $boundary
 *   An optional pre-generated boundary string to use for wrapping data.
 *   Please reference section 7.2 "The Multipart Content-Type" in RFC1341.
 *
 * @return array
 *   An array with 2 items:
 *     0 - string boundary
 *     1 - string (can contain binary data) data
 */
function wrap_multipart_data(iterable $parts, ?string $boundary = null): array {
    if (empty($boundary)) {
        $boundary = '-----------------------------------------boundary' . time();
    }
    $data = '';
    foreach ($parts as $part) {
        list($header, $content) = $part;
        // Check content for the delimiter ("--" followed by the boundary).
        // Note: Won't check header and expect the program makes sense there.
        if (strpos($content, "--{$boundary}") !== false) {
            throw new \Exception('Error: data contains the multipart boundary');
        }
        $data .= "--{$boundary}\r\n";
        $data .= implode("\r\n", $header) . "\r\n\r\n" . $content . "\r\n";
    }
    // signal end of request (note the trailing "--")
    $data .= "--{$boundary}--\r\n";
    return [$boundary, $data];
}
// build data for a multipart/form-data request
list($boundary, $data) = wrap_multipart_data(generate_multipart_data_parts(
    // normal form variables
    [
        'hello' => 'world',
        'foo' => 'bar',
    ],
    // files
    [
        'upload_file' => 'path/to/your/file.xlsx',
    ]
));
// Send POST REQUEST
$context_options = array(
    'http' => array(
        'method' => 'POST',
        'header' => "Content-type: multipart/form-data; boundary={$boundary}\r\n"
            . "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data,
        'timeout' => 10,
    )
);
$context = stream_context_create($context_options);
$result = fopen('http://localhost:5000/api/extraction','r', false, $context);
Your Python script should receive the file as a normal HTTP form file upload (with the file field named "upload_file"). Use the method your framework provides to get the file from the request.
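On the Python side, a minimal sketch of such a handler might look like the following. It assumes the server is a Flask app (the question's code suggests this) and reuses the question's `UPLOAD_FILE_FOLDER` config key. The temporary upload directory and the JSON response shape are illustrative assumptions, not part of the original code.

```python
import os
import tempfile

from flask import Flask, jsonify, request
from werkzeug.utils import secure_filename

app = Flask(__name__)
# Assumption: uploads go to a throwaway temp directory for this sketch.
app.config["UPLOAD_FILE_FOLDER"] = tempfile.mkdtemp()


@app.route("/api/extraction", methods=["POST"])
def extraction():
    # Ordinary form fields (e.g. "hello", "foo") arrive in request.form.
    fields = request.form.to_dict()

    # The uploaded file arrives in request.files under its field name.
    upload = request.files.get("upload_file")
    if upload is None or not upload.filename:
        return jsonify(error="missing upload_file"), 400

    # Strip path components and unsafe characters from the client-supplied name.
    filename = secure_filename(upload.filename)
    if not filename:
        return jsonify(error="invalid filename"), 400

    path = os.path.join(app.config["UPLOAD_FILE_FOLDER"], filename)
    upload.save(path)

    # Placeholder response; the real extraction would run on `path` here.
    return jsonify(filename=filename, size=os.path.getsize(path), fields=fields)
```

Because the file travels inside the POST body, the Python server never has to call back to the PHP server, so the two servers no longer wait on each other.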
If you're concerned about binary safety, or if the multipart upload somehow fails, the other approach would be submitting the file as a base64-encoded string:
<?php
// read the file and base64-encode its contents (not its path)
$file_data = file_get_contents('path/to/your/file.xlsx');
// http_build_query() URL-encodes the array into a form body
$data = http_build_query([
    'upload_file' => base64_encode($file_data),
]);
// Send POST REQUEST
$context_options = array(
    'http' => array(
        'method' => 'POST',
        'header' => "Content-type: application/x-www-form-urlencoded\r\n"
            . "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data,
        'timeout' => 10,
    )
);
$context = stream_context_create($context_options);
$result = fopen('http://localhost:5000/api/extraction','r', false, $context);
You'd get the file data on your Python server as a base64-encoded string in the field named "upload_file". You need to decode it to get the original binary content.
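The decoding step could be sketched with the standard library alone, as below. The helper name `decode_upload` is hypothetical, and rejecting malformed input with a `ValueError` is one possible design choice rather than something the original answer prescribes.

```python
import base64
import binascii


def decode_upload(encoded: str) -> bytes:
    """Decode a base64 form value back into the original file bytes."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("upload_file is not valid base64") from exc
```

In a Flask handler this would be called as `decode_upload(request.form["upload_file"])`, and the returned bytes written to disk in binary mode.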
If you insist on your current 2-roundtrip approach, the simple solution is to have 2 different endpoints: one that sends the POST request to the Python server, and another from which the Python server downloads the file.
From your description, your deadlock is there because you're using the same script for both purposes. I don't see a reason why they can't be 2 separate scripts / route controllers.