
Query binary data using sqlalchemy with PostgreSQL

I have a simple database that stores attachments as blobs.

CREATE TABLE public.attachment
(
  id integer NOT NULL,
  attachdata oid,
  CONSTRAINT attachment_pkey PRIMARY KEY (id)
)

-- Import a file
INSERT INTO attachment (id, attachdata) VALUES (1, lo_import('C:\temp\blob_import.txt'))
-- Export back as file. 
SELECT lo_export(attachdata, 'C:\temp\blob_export_postgres.txt') FROM attachment WHERE id = 1

I can read this file back directly using psycopg2.

from psycopg2 import connect
con = connect(dbname="blobtest", user="postgres", password="postgres", host="localhost")
cur = con.cursor()
cur.execute("SELECT attachdata FROM attachment WHERE id = 1")
oid = cur.fetchone()[0]
obj = con.lobject(oid)
obj.export('C:\\temp\\blob_export_psycopg.txt')

When I try the same with sqlalchemy, attachdata is a bytestring of zeros. I have tested the following code with the BLOB, LargeBinary and BINARY types. The size of the attachdata bytestring seems to be the OID value.

from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, Binary
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()
Session = sessionmaker()

engine = create_engine('postgresql://postgres:postgres@localhost:5432/blobtest', echo=True)
Base.metadata.create_all(engine)
Session.configure(bind=engine)

class Attachment(Base):
    __tablename__ = "attachment"
    id = Column(Integer, primary_key=True)
    attachdata = Column(Binary)

session = Session()
attachment = session.query(Attachment).get(1)
with open('C:\\temp\\blob_export_sqlalchemy.txt', 'wb') as f:
    f.write(attachment.attachdata)

I have searched the sqlalchemy documentation and various other sources, but could not find a solution for how to export binary data with sqlalchemy.

I had the same problem. There seems to be no way to get large object data through the ORM, since the oid column only holds a reference to the large object, not its contents. So I combined the ORM with the psycopg2 engine like this:

from sqlalchemy import create_engine
from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, scoped_session
from sqlalchemy.dialects.postgresql import OID

Base = declarative_base()
session_factory = sessionmaker()

engine = create_engine('postgresql+psycopg2://postgres:postgres@localhost:5432/postgres', echo=True)
Base.metadata.create_all(engine)
session_factory.configure(bind=engine)
Session = scoped_session(session_factory)


class Attachment(Base):
    __tablename__ = "attachment"
    id = Column(Integer, primary_key=True)
    oid = Column(OID)

    @classmethod
    def insert_file(cls, filename):
        # Write the file into a new large object via psycopg2's lobject API,
        # then store the resulting OID in the attachment row.
        conn = engine.raw_connection()
        l_obj = conn.lobject(0, 'wb', 0)
        with open(filename, 'rb') as f:
            l_obj.write(f.read())
        conn.commit()
        conn.close()
        session = Session()
        attachment = cls(oid=l_obj.oid)
        session.add(attachment)
        session.commit()
        return attachment.id

    @classmethod
    def get_file(cls, attachment_id, filename):
        # Look up the stored OID through the ORM, then read the large object
        # back through the raw psycopg2 connection.
        session = Session()
        attachment = session.query(Attachment).get(attachment_id)
        conn = engine.raw_connection()
        l_obj = conn.lobject(attachment.oid, 'rb')
        with open(filename, 'wb') as f:
            f.write(l_obj.read())
        conn.close()


if __name__ == '__main__':
    my_id = Attachment.insert_file(r'C:\path\to\file')
    Attachment.get_file(my_id, r'C:\path\to\file_out')

Not very elegant, but it seems to work.
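
As an aside, on PostgreSQL 9.4 or newer the stored contents can also be pulled with plain SQL via lo_get(), so the raw psycopg2 connection is not strictly needed for reads. A minimal sketch (table and credentials as in the question):

from sqlalchemy import create_engine, text

engine = create_engine('postgresql+psycopg2://postgres:postgres@localhost:5432/blobtest')

with engine.connect() as conn:
    # lo_get(oid) returns the large object's contents as bytea (PostgreSQL 9.4+).
    blob = conn.execute(
        text("SELECT lo_get(attachdata) FROM attachment WHERE id = :id"),
        {"id": 1},
    ).scalar()

with open(r'C:\temp\blob_export_lo_get.txt', 'wb') as f:
    f.write(blob)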

Update:

I'm now using events:

from sqlalchemy import create_engine, event
from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, scoped_session, object_session
from sqlalchemy.dialects.postgresql import OID

Base = declarative_base()
session_factory = sessionmaker()

engine = create_engine('postgresql+psycopg2://postgres:postgres@localhost:5432/postgres', echo=True)
Base.metadata.create_all(engine)
session_factory.configure(bind=engine)
Session = scoped_session(session_factory)

class Data(Base):
    __tablename__ = "attachment"
    id = Column(Integer, primary_key=True)
    oid = Column(OID)


@event.listens_for(Data, 'after_delete')
def remove_large_object_after_delete(_, connection, target):
    raw_connection = connection.connection
    l_obj = raw_connection.lobject(target.oid, 'n')
    l_obj.unlink()
    raw_connection.commit()


@event.listens_for(Data, 'before_insert')
def add_large_object_before_insert(_, connection, target):
    raw_connection = connection.connection
    l_obj = raw_connection.lobject(0, 'wb', 0)
    target.oid = l_obj.oid
    l_obj.write(target.ldata)
    raw_connection.commit()


@event.listens_for(Data, 'load')
def inject_large_object_after_load(target, _):
    session = object_session(target)
    conn = session.get_bind().raw_connection()
    l_obj = conn.lobject(target.oid, 'rb')
    target.ldata = l_obj.read()

if __name__ == '__main__':
    session = Session()
    # Put
    data = Data()
    data.ldata = b'your large data'
    session.add(data)
    session.commit()

    id = data.id

    # Get
    data2 = session.query(Data).get(id)
    print(data2.ldata)  # Your large data is here

    # Delete
    session.delete(data)
    session.delete(data2)
    session.commit()

    session.flush()
    session.close()

Works well so far.
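
One caveat worth noting: mapper events like the ones above only fire for ORM operations on instances, so a bulk query delete bypasses remove_large_object_after_delete and would leave the large object orphaned. For example (some_id standing in for a real primary key):

# This does NOT run the 'after_delete' listener, so the large object is not unlinked:
session.query(Data).filter(Data.id == some_id).delete()

# Deleting through the instance runs the listener and unlinks the large object:
obj = session.query(Data).get(some_id)
session.delete(obj)
session.commit()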

I don't understand why Postgres large objects are so neglected these days. I use them heavily. Or rather, I would like to, but it's challenging, especially with asyncio...
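
For what it's worth, one way around the missing client-side lobject API under asyncio is to fall back to the server-side lo_get()/lo_put() functions (PostgreSQL 9.4+), for example with asyncpg. A rough, untested sketch (stored_oid standing in for an OID taken from the attachment table):

import asyncio
import asyncpg

async def read_large_object(oid):
    conn = await asyncpg.connect('postgresql://postgres:postgres@localhost:5432/postgres')
    try:
        # lo_get(oid) reads the whole large object server-side and returns it as bytes.
        return await conn.fetchval('SELECT lo_get($1)', oid)
    finally:
        await conn.close()

# data = asyncio.run(read_large_object(stored_oid))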
