Why doesn't a multiprocess Python gRPC server work?

Question

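I instantiate a gRPC server in each subprocess through a multiprocess pool. When I access the server with multiple clients, I run into the following two problems: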

All clients hit the same server subprocess
The client raises MaybeEncodingError

For reference, my development environment is:

[OS]
ProductName:    Mac OS X
ProductVersion: 10.14.6
BuildVersion:   18G5033

[packages]
grpcio = '1.30.0'
grpcio-tools = '1.30.0'
multiprocess = "0.70.10"
grpcio-status = "1.30.0"
googleapis-common-protos = "1.52.0"

[requires]
python_version = "3.8.3"

Here is the server output:

[PID 83287] Binding to 'localhost:52909'
[PID 83288] Starting new server.
[PID 83289] Starting new server.
[PID 83290] Starting new server.
[PID 83291] Starting new server.
[PID 83292] Starting new server.
[PID 83293] Starting new server.
[PID 83294] Starting new server.
[PID 83295] Starting new server.
[PID 83295] Determining primality of 2
[PID 83295] Determining primality of 9
[PID 83295] Determining primality of 23
[PID 83295] Determining primality of 16
[PID 83295] Determining primality of 10
[PID 83295] Determining primality of 3
[PID 83295] Determining primality of 24
[PID 83295] Determining primality of 17
[PID 83295] Determining primality of 11
[PID 83295] Determining primality of 25
[PID 83295] Determining primality of 4
[PID 83295] Determining primality of 18
[PID 83295] Determining primality of 5
[PID 83295] Determining primality of 12
[PID 83295] Determining primality of 19
[PID 83295] Determining primality of 26
[PID 83295] Determining primality of 6
[PID 83295] Determining primality of 13
[PID 83295] Determining primality of 27
[PID 83295] Determining primality of 20
[PID 83295] Determining primality of 7
[PID 83295] Determining primality of 14
[PID 83295] Determining primality of 8
[PID 83295] Determining primality of 28
[PID 83295] Determining primality of 15

As shown above, all clients hit the same server subprocess, [PID 83295]. Why?

Here is the client error message:

Traceback (most recent call last):
  File "/Users/zhaolong/PycharmProjects/pipEnvGrpc/grpc_multi/client.py", line 96, in <module>
    main()
  File "/Users/zhaolong/PycharmProjects/pipEnvGrpc/grpc_multi/client.py", line 86, in main
    primes = _calculate_primes(args.server_address)
  File "/Users/zhaolong/PycharmProjects/pipEnvGrpc/grpc_multi/client.py", line 71, in _calculate_primes
    primality = worker_pool.map(_run_worker_query, check_range)
  File "/Users/zhaolong/.local/share/virtualenvs/pipEnvGrpc-7cHuZ_0E/lib/python3.8/site-packages/multiprocess/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/zhaolong/.local/share/virtualenvs/pipEnvGrpc-7cHuZ_0E/lib/python3.8/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
multiprocess.pool.MaybeEncodingError: Error sending result: '[isPrime: true
, , isPrime: true
, , isPrime: true
, , ]'. Reason: 'PicklingError("Can't pickle <class 'prime_pb2.Primality'>: it's not the same object as prime_pb2.Primality")'

As shown above, it raises multiprocess.pool.MaybeEncodingError and PicklingError. Why?

Here is the complete code:

// the prime.proto from https://github.com/grpc/grpc/tree/v1.30.0/examples/python/multiprocessing

syntax = "proto3";

package prime;

// A candidate integer for primality testing.
message PrimeCandidate {
    // The candidate.
    int64 candidate = 1;
}

// The primality of the requested integer candidate.
message Primality {
    // Is the candidate prime?
    bool isPrime = 1;
}

// Service to check primality.
service PrimeChecker {
    // Determines the primality of an integer.
    rpc check (PrimeCandidate) returns (Primality) {}
}

# the server.py from https://github.com/grpc/grpc/tree/v1.30.0/examples/python/multiprocessing,
# and I have modified some places.


"""An example of multiprocess concurrency with gRPC."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import multiprocessing
from concurrent import futures
import contextlib
import datetime
import logging
import math
import time
import socket
import sys
from multiprocess import pool
import grpc

from grpc_multi import prime_pb2
from grpc_multi import prime_pb2_grpc

_LOGGER = logging.getLogger(__name__)

_ONE_DAY = datetime.timedelta(days=1)
_PROCESS_COUNT = multiprocessing.cpu_count()
_THREAD_CONCURRENCY = _PROCESS_COUNT


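def is_prime(n):
    # Test every divisor up to and including floor(sqrt(n)). The previous
    # bound, int(math.ceil(math.sqrt(n))), excluded sqrt(n) itself, so
    # perfect squares such as 4 and 9 were reported as prime.
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True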


class PrimeChecker(prime_pb2_grpc.PrimeCheckerServicer):

    def check(self, request, context):
        _LOGGER.info('Determining primality of %s', request.candidate)
        return prime_pb2.Primality(isPrime=is_prime(request.candidate))


def _wait_forever(server):
    try:
        while True:
            time.sleep(_ONE_DAY.total_seconds())
    except KeyboardInterrupt:
        server.stop(None)


def _run_server(bind_address):
    """Start a server in a subprocess."""
    _LOGGER.info('Starting new server.')
    options = (('grpc.so_reuseport', 1),)

    server = grpc.server(futures.ThreadPoolExecutor(
        max_workers=_THREAD_CONCURRENCY,),
                         options=options)
    prime_pb2_grpc.add_PrimeCheckerServicer_to_server(PrimeChecker(), server)
    server.add_insecure_port(bind_address)
    server.start()
    # _wait_forever(server)
    server.wait_for_termination()


@contextlib.contextmanager
def _reserve_port():
    """Find and reserve a port for all subprocesses to use."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    if sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT) == 0:
        raise RuntimeError("Failed to set SO_REUSEPORT.")
    sock.bind(('', 0))
    try:
        yield sock.getsockname()[1]
    finally:
        sock.close()


def main():
    with _reserve_port() as port:
        bind_address = 'localhost:{}'.format(port)
        _LOGGER.info("Binding to '%s'", bind_address)
        sys.stdout.flush()
        addrs = [bind_address for _ in range(_PROCESS_COUNT)]
        server_pool = pool.Pool(processes=_PROCESS_COUNT)
        server_pool.map(_run_server, addrs)
        server_pool.close()
        # server_pool.join()

if __name__ == '__main__':
    handler = logging.StreamHandler(sys.stdout)
    formatter = logging.Formatter('[PID %(process)d] %(message)s')
    handler.setFormatter(formatter)
    _LOGGER.addHandler(handler)
    _LOGGER.setLevel(logging.INFO)
    main()

# the client.py from https://github.com/grpc/grpc/tree/v1.30.0/examples/python/multiprocessing,
# and I have modified some places.


"""An example of multiprocessing concurrency with gRPC."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import atexit
import logging
from multiprocess import pool
import operator
import sys

import grpc

from grpc_multi import prime_pb2
from grpc_multi import prime_pb2_grpc

_PROCESS_COUNT = 4
_MAXIMUM_CANDIDATE = 100

# Each worker process initializes a single channel after forking.
# It's regrettable, but to ensure that each subprocess only has to instantiate
# a single channel to be reused across all RPCs, we use globals.
_worker_channel_singleton = None
_worker_stub_singleton = None

_LOGGER = logging.getLogger(__name__)


def _shutdown_worker():
    _LOGGER.info('Shutting worker process down.')
    if _worker_channel_singleton is not None:
        _worker_channel_singleton.close()


def _initialize_worker(server_address):
    global _worker_channel_singleton  # pylint: disable=global-statement
    global _worker_stub_singleton  # pylint: disable=global-statement
    _LOGGER.info('Initializing worker process.')
    _worker_channel_singleton = grpc.insecure_channel(server_address)
    _worker_stub_singleton = prime_pb2_grpc.PrimeCheckerStub(
        _worker_channel_singleton)
    atexit.register(_shutdown_worker)


def _run_worker_query(primality_candidate):
    _LOGGER.info('Checking primality of %s.', primality_candidate)
    return _worker_stub_singleton.check(
        prime_pb2.PrimeCandidate(candidate=primality_candidate))


def _calculate_primes(server_address):
    worker_pool = pool.Pool(processes=_PROCESS_COUNT,
                                       initializer=_initialize_worker,
                                       initargs=(server_address,))
    check_range = range(2, _MAXIMUM_CANDIDATE)
    primality = worker_pool.map(_run_worker_query, check_range)
    worker_pool.close()
    worker_pool.join()
    primes = zip(check_range, map(operator.attrgetter('isPrime'), primality))
    return tuple(primes)


def main():
    msg = 'Determine the primality of the first {} integers.'.format(
        _MAXIMUM_CANDIDATE)
    parser = argparse.ArgumentParser(description=msg)
    parser.add_argument('--server_address',
                        default='localhost:52909',
                        help='The address of the server (e.g. localhost:50051)')
    args = parser.parse_args()
    primes = _calculate_primes(args.server_address)
    print(primes)


if __name__ == '__main__':
    handler = logging.StreamHandler(sys.stdout)
    formatter = logging.Formatter('[PID %(process)d] %(message)s')
    handler.setFormatter(formatter)
    _LOGGER.addHandler(handler)
    _LOGGER.setLevel(logging.INFO)
    main()

Following Lidi Zheng's suggestion, I tried using multiple channels in the client, but it still doesn't seem to work. Is my approach wrong?

def _run_worker_query(param):
    server_address, primality_candidate = param
    worker_channel_singleton = grpc.insecure_channel(server_address)
    worker_stub_singleton = prime_pb2_grpc.PrimeCheckerStub(worker_channel_singleton)
    _LOGGER.info('Checking primality of %s.', primality_candidate)
    res = worker_stub_singleton.check(
        prime_pb2.PrimeCandidate(candidate=primality_candidate))
    return res.isPrime


def _calculate_primes(server_address):
    # worker_pool = pool.Pool(processes=_PROCESS_COUNT,
    #                                    initializer=_initialize_worker,
    #                                    initargs=(server_address,))
    worker_pool = pool.Pool(processes=_PROCESS_COUNT)
    check_range = range(2, _MAXIMUM_CANDIDATE)
    params = [(server_address, r) for r in check_range]
    primality = worker_pool.map(_run_worker_query, params)
    worker_pool.close()
    worker_pool.join()
    # primes = zip(check_range, map(operator.attrgetter('isPrime'), primality))
    primes = zip(check_range, primality)
    return tuple(primes)

The server output is:

[PID 43279] Determining primality of 3
[PID 43279] Determining primality of 2
[PID 43279] Determining primality of 4
[PID 43279] Determining primality of 5
[PID 43279] Determining primality of 6
[PID 43279] Determining primality of 7
[PID 43279] Determining primality of 8
[PID 43279] Determining primality of 9

Answer 1

Thanks for the detailed reproduction case. I was able to reproduce the first issue, but not the second.

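Your use of the multiprocessing gRPC Python server is correct. Before diving into the details, I want to reiterate the tricky part of multiprocessing gRPC Python servers: a gRPC Python server does not support forking after it has started. To enable multiprocessing with gRPC, the application needs to fork as early as possible (see example-multiprocessing).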

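Regarding the first issue, the cause is that the client creates only one channel. _initialize_worker uses the global variable _worker_channel_singleton to store the channel, which means that although there are multiple processes, there is only one channel. One gRPC channel means one TCP connection, and the automatic load balancing of SO_REUSEPORT works only at the granularity of TCP connections. Try multiple channels; that should solve your problem. (Side note: to load-balance requests coming from a single gRPC channel, you can use an L7 load balancer, such as L7 ILB.)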
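
The connection-level granularity can be observed without gRPC. The following is a minimal standard-library sketch, not part of the original example: open_listener and count_accepts are hypothetical helper names, and the listeners all live in one process for brevity, whereas the question runs one listener per server process. It assumes a platform that exposes socket.SO_REUSEPORT (Linux or macOS, not Windows).

```python
import select
import socket


def open_listener(port=0):
    """Open a TCP listener with SO_REUSEPORT set before binding."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(16)
    return sock


def count_accepts(num_listeners=4, num_connections=20):
    """Return how many client connections each listener accepted."""
    first = open_listener()  # port 0 lets the OS pick a free port
    port = first.getsockname()[1]
    listeners = [first]
    listeners += [open_listener(port) for _ in range(num_listeners - 1)]
    counts = [0] * num_listeners
    open_sockets = []
    try:
        for _ in range(num_connections):
            # Every new TCP connection is assigned by the kernel to exactly
            # one of the listeners sharing the port.
            open_sockets.append(socket.create_connection(("127.0.0.1", port)))
            ready = select.select(listeners, [], [], 1.0)[0]
            for listener in ready:
                conn, _addr = listener.accept()
                open_sockets.append(conn)
                counts[listeners.index(listener)] += 1
    finally:
        for sock in open_sockets + listeners:
            sock.close()
    return counts


if __name__ == "__main__":
    print(count_accepts())
```

How the counts are spread over the listeners depends on the operating system. Whatever the spread, every request sent over one already established connection stays with the listener that accepted it, which is why a single reused channel keeps reaching one server process.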

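Regarding the second issue, protobuf messages are not picklable. To pass protobuf messages across processes, the client needs to serialize them to strings (bytes) and deserialize them on the other side. This is due to the nature of protobuf: a serialized message can map to many (if not infinitely many) possible original messages, so without the schema it is impossible to decode.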
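
As a rough illustration of the serialize-then-parse approach, the sketch below passes raw bytes through pickle instead of the message object. It uses google.protobuf.wrappers_pb2.BoolValue as a stand-in for prime_pb2.Primality (an assumption, so that the snippet runs without the generated prime_pb2 module), and to_wire and from_wire are hypothetical helper names.

```python
import pickle

from google.protobuf import wrappers_pb2


def to_wire(message):
    """Serialize a protobuf message to bytes, which pickle handles natively."""
    return message.SerializeToString()


def from_wire(message_cls, payload):
    """Rebuild a message from bytes; the receiver must supply the class."""
    message = message_cls()
    message.ParseFromString(payload)
    return message


# Stand-in for the Primality reply returned by the worker function.
reply = wrappers_pb2.BoolValue(value=True)
payload = to_wire(reply)

# pickle.dumps/loads mimics what the pool does when sending a result back.
received = pickle.loads(pickle.dumps(payload))
restored = from_wire(wrappers_pb2.BoolValue, received)
print(restored.value)
```

In the question's client, this would correspond to returning the serialized bytes from _run_worker_query and parsing them in the parent. Returning a plain value such as res.isPrime, as the updated _run_worker_query does, avoids the problem in the same way.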

Answer 2

Answering the part about only a single PID being used.

This appears to be an OS X issue. I tried this on Ubuntu, and there all of the server PIDs were used.

Answer 3

If multiprocessing on macOS is not using separate processes, you should change the start method to spawn:

import multiprocessing
import os
if os.uname().sysname == "Darwin":
    multiprocessing.set_start_method('spawn')
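
One way to check whether a pool really uses separate worker processes is to collect the worker PIDs. This is a minimal sketch with the standard-library multiprocessing module; report_pid and distinct_worker_pids are hypothetical names. Note that the question imports pool from the third-party multiprocess package, so the start method may need to be set on that package instead (an assumption worth verifying in your setup).

```python
import multiprocessing
import os


def report_pid(_task):
    """Return the PID of the process that handled this task."""
    return os.getpid()


def distinct_worker_pids(start_method="spawn", processes=4, tasks=32):
    """Run trivial tasks in a pool and return the set of worker PIDs seen."""
    # get_context selects a start method for this pool only, without
    # changing the global default the way set_start_method does.
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(processes=processes) as workers:
        return set(workers.map(report_pid, range(tasks)))
```

Call distinct_worker_pids() from a script under an if __name__ == "__main__": guard, which the spawn start method requires. The returned set can hold fewer PIDs than processes when some workers never receive a task, but it should not contain the parent's PID.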
