Encode a NumPy array image to an image type (.png etc.) to use it with the GCloud Vision API - without OpenCV

Question

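After deciding not to use OpenCV because I only use one of its functions, I want to replace cv2.imencode() with something else. The goal is to convert a 2D NumPy array to an image format (such as .png) so that it can be sent to the GCloud Vision API.

This is what I have been using so far: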

content = cv2.imencode('.png', image)[1].tostring()
image = vision.types.Image(content=content)

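Now I want to achieve the same result without OpenCV.

What I have found so far:

The Vision API needs base64-encoded data
imencode returns the encoded bytes for the specified image type

I think it is worth noting that my NumPy array is a binary image with only 2 dimensions, and that the whole function will be used inside an API, so saving a PNG to disk and reloading it should be avoided.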

Answer 1

A PNG writer in pure Python

If you insist on using more or less pure Python, the following function from ideaman's answer to this question is useful.

def write_png(buf, width, height):
    """ buf: must be bytes or a bytearray in Python3.x,
        a regular string in Python2.x.
    """
    import zlib, struct

    # reverse the vertical line order and add null bytes at the start
    width_byte_4 = width * 4
    raw_data = b''.join(
        b'\x00' + buf[span:span + width_byte_4]
        for span in range((height - 1) * width_byte_4, -1, - width_byte_4)
    )

    def png_pack(png_tag, data):
        chunk_head = png_tag + data
        return (struct.pack("!I", len(data)) +
                chunk_head +
                struct.pack("!I", 0xFFFFFFFF & zlib.crc32(chunk_head)))

    return b''.join([
        b'\x89PNG\r\n\x1a\n',
        png_pack(b'IHDR', struct.pack("!2I5B", width, height, 8, 6, 0, 0, 0)),
        png_pack(b'IDAT', zlib.compress(raw_data, 9)),
        png_pack(b'IEND', b'')])

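Writing the NumPy array to PNG-formatted bytes, encoded as base64

To represent the grayscale image as an RGBA image, we stack the matrix into 4 channels and set the alpha channel. (This assumes your 2D NumPy array is called "img".) We also flip the array vertically, because write_png reverses the vertical line order when it writes the rows.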

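import base64
import numpy as np

# img is assumed to be a 2D uint8 array; stack it into 4 channels (RGBA)
# and flip the y-axis, since write_png reverses the row order again
img_rgba = np.flipud(np.stack((img,) * 4, axis=-1))
img_rgba[:, :, -1] = 255  # make the alpha channel fully opaque
data = write_png(bytearray(img_rgba), img_rgba.shape[1], img_rgba.shape[0])
data_enc = base64.b64encode(data)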
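
Because the array is already single-channel, the RGBA stacking and the vertical flip could probably be avoided by writing a grayscale PNG (colour type 0) directly. The following is only a sketch under that assumption and is not part of the original answer; write_gray_png and png_chunk are hypothetical names, and the input is assumed to be 8-bit pixels in row-major order with the top row first.

```python
import struct
import zlib


def png_chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, tag, data, CRC-32 over tag + data."""
    body = tag + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack("!I", len(data)) + body + struct.pack("!I", crc)


def write_gray_png(pixels: bytes, width: int, height: int) -> bytes:
    """Encode row-major 8-bit grayscale pixels (top row first) as PNG bytes."""
    if len(pixels) != width * height:
        raise ValueError("pixels must contain exactly width * height bytes")
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + pixels[y * width:(y + 1) * width] for y in range(height)
    )
    # IHDR fields: width, height, bit depth 8, colour type 0 (grayscale),
    # compression method 0, filter method 0, interlace method 0.
    ihdr = struct.pack("!2I5B", width, height, 8, 0, 0, 0, 0)
    return b"".join([
        b"\x89PNG\r\n\x1a\n",
        png_chunk(b"IHDR", ihdr),
        png_chunk(b"IDAT", zlib.compress(raw, 9)),
        png_chunk(b"IEND", b""),
    ])


# Tiny 2x2 checkerboard as a stand-in for real image data.
demo = write_gray_png(bytes([0, 255, 255, 0]), 2, 2)

# Hypothetical usage with a 2D uint8 NumPy array named img:
#   content = write_gray_png(img.tobytes(), img.shape[1], img.shape[0])
```

If the binary image stores its foreground as 1 rather than 255, it would need to be scaled first (for example img * 255), otherwise the result looks almost entirely black.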

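Testing that the encoding works

Finally, to make sure the encoding works, we decode the base64 string and write the output to disk as "test_out.png". Check that this is the same image you started with.

with open("test_out.png", "wb") as fb:
    # base64.decodestring was removed in Python 3.9; b64decode is equivalent here
    fb.write(base64.b64decode(data_enc))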
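
Since the question asks to avoid touching the disk, a rough sanity check can also be done in memory. The sketch below only verifies that the decoded bytes start with the PNG signature and end with the fixed IEND chunk; it does not validate the pixel data. The name looks_like_png is hypothetical.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# The IEND chunk has no data, so its tag and CRC are always these 8 bytes.
PNG_TRAILER = b"IEND\xaeB`\x82"


def looks_like_png(data_enc: bytes) -> bool:
    """Return True if the base64 payload decodes to PNG-shaped bytes."""
    decoded = base64.b64decode(data_enc)
    return decoded.startswith(PNG_SIGNATURE) and decoded.endswith(PNG_TRAILER)
```

Called as looks_like_png(data_enc), this catches gross encoding mistakes, but viewing the written file remains the more reliable check.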

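Alternative: just use PIL

However, I assume you are using some library to actually read your images in the first place? (Unless you are generating them.) Most image-reading libraries support this kind of thing. Assuming you are using PIL, you could also try the following snippet (from this answer). It simply saves the file in memory rather than on disk and uses that to produce a base64 string.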

import base64
import io

# img is assumed to be a PIL Image here
in_mem_file = io.BytesIO()
img.save(in_mem_file, format="PNG")
# reset file pointer to start
in_mem_file.seek(0)
img_bytes = in_mem_file.read()

base64_encoded_result_bytes = base64.b64encode(img_bytes)
base64_encoded_result_str = base64_encoded_result_bytes.decode('ascii')
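
The snippet above starts from a PIL image, while the question starts from a NumPy array. A minimal sketch of bridging the two, assuming Pillow and NumPy are installed and the array is 2D with values that fit in uint8, might look like this; array_to_png_bytes is a hypothetical helper name.

```python
import io

import numpy as np
from PIL import Image


def array_to_png_bytes(arr: np.ndarray) -> bytes:
    """Encode a 2D uint8 array as PNG bytes entirely in memory."""
    # A 2D uint8 array becomes an 8-bit grayscale ("L" mode) PIL image.
    pil_img = Image.fromarray(np.ascontiguousarray(arr, dtype=np.uint8))
    buf = io.BytesIO()
    pil_img.save(buf, format="PNG")
    return buf.getvalue()


# Example input: a small binary image scaled to 0/255 so it is visible.
binary = np.array([[0, 1], [1, 0]], dtype=np.uint8) * 255
png_bytes = array_to_png_bytes(binary)
```

The resulting bytes would take the place of the cv2.imencode(...) output in the question's code, which passes the encoded bytes as content; base64.b64encode(png_bytes) can be applied on top if a base64 string is needed.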

Solution 1
4 Accepted 2019-06-12 17:12:47