
Streaming binary data from SQL Server with JPA/Hibernate

I have large files stored on SQL Server and I need to get them as InputStreams to stream to clients without exhausting memory. I'm using Hibernate 5.2.11, WildFly 10.1 and Microsoft JDBC Driver 6.2.1 (which supports streaming to/from SQL Server).

To map the InputStream to my entities, I need to create a custom Hibernate type, since Hibernate unfortunately does not provide such a mapping.

import java.io.InputStream;
import java.io.Serializable;
import java.sql.Blob;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Objects;

import org.hibernate.HibernateException;
import org.hibernate.engine.spi.SharedSessionContractImplementor;
import org.hibernate.type.BlobType;
import org.hibernate.type.SerializationException;
import org.hibernate.usertype.UserType;

public class InputStreamType implements UserType {

    private static final int[] SQL_TYPES = { Types.LONGVARBINARY };

    @Override
    public int[] sqlTypes() {
        return SQL_TYPES;
    }

    @Override
    public Class<?> returnedClass() {
        return InputStream.class;
    }

    @Override
    public boolean equals(Object x, Object y) throws HibernateException {
        return Objects.equals(x, y);
    }

    @Override
    public int hashCode(Object x) throws HibernateException {
        return Objects.hashCode(x);
    }

    @Override
    public Object nullSafeGet(ResultSet rs, String[] names, SharedSessionContractImplementor session, Object owner)
            throws HibernateException, SQLException {
        Blob blob = (Blob) BlobType.INSTANCE.nullSafeGet(rs, names, session, owner);
        return blob == null ? null : blob.getBinaryStream();
    }

    @Override
    public void nullSafeSet(PreparedStatement st, Object value, int index, SharedSessionContractImplementor session)
            throws HibernateException, SQLException {
        if (value == null) {
            st.setNull(index, SQL_TYPES[0]);
        } else {
            st.setBinaryStream(index, (InputStream) value);
        }
    }

    @Override
    public Object deepCopy(Object value) throws HibernateException {
        return value;
    }

    @Override
    public boolean isMutable() {
        return false;
    }

    @Override
    public Serializable disassemble(Object value) throws HibernateException {
        throw new SerializationException("Cannot serialize " + InputStream.class.getName(), null);
    }

    @Override
    public Object assemble(Serializable cached, Object owner) throws HibernateException {
        return cached;
    }

    @Override
    public Object replace(Object original, Object target, Object owner) throws HibernateException {
        return original;
    }

}

Then, I annotate my entity field to use this type:

@Entity
public class User {

    @Id
    Long id;

    @Lob
    @Type(type = "InputStreamType")
    InputStream picture;

    // ...

    InputStream getPicture() {
        return this.picture;
    }

    // ...

}

But when I try to read the stream, I get an exception:

@Service
public class UserService {

    @PersistenceContext
    EntityManager em;

    @Transactional
    public void testReadInputStream() {
        InputStream picture = em.find(User.class, 1L).getPicture();
        System.out.println(picture.getClass().getName());
        // prints: com.microsoft.sqlserver.jdbc.PLPInputStream
        picture.read();
        // throws: IOException The TDS protocol stream is not valid.
        //         at com.microsoft.sqlserver.jdbc.PLPInputStream.readBytes(PLPInputStream.java:304)
        //         at com.microsoft.sqlserver.jdbc.PLPInputStream.read(PLPInputStream.java:244)
    }

}

I tried to read the stream inside InputStreamType.nullSafeGet, and there it does NOT throw any exception.

So, what happens to the stream after it is returned from InputStreamType.nullSafeGet? How can I keep it usable?

UPDATE 1

I simplified the case to this:

import java.io.IOException;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    public static void main(String[] args) throws Exception {
        Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
        Connection connection = DriverManager.getConnection(
                "jdbc:sqlserver://localhost:1433;DatabaseName=test");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("select picture from [User] where id = 1");
        resultSet.next();
        InputStream inputStream = resultSet.getBlob(1).getBinaryStream();
        System.out.println(inputStream.getClass().getName());
        // prints: com.microsoft.sqlserver.jdbc.PLPInputStream

        inputStream.read();
        // no exception

        statement.close();
        inputStream.read();
        // throws: IOException: The TDS protocol stream is not valid.
    }

}

Could it be that, once the custom type returns, the statement is closed and the stream becomes inaccessible? If that's the case, how can I overcome this in JPA?

UPDATE 2

My latest finding: if I return the blob's stream, it gets closed when the session is closed; but if I return the Blob itself, the stream somehow gets loaded into memory when the session is closed, exhausting it for large streams (I think this behavior is triggered by a Hibernate proxy on the Blob).

Is there a way to keep the session open when returning from a repository method, so that the stream stays accessible and the data can be flushed via HTTP without exhausting memory?
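One way to sidestep the lifetime problem, sketched here with plain JDBC, is to invert the flow: instead of returning the stream from the data-access method, pass the target OutputStream in and copy while the statement and result set are still open. The class PictureStreamer and its method names are hypothetical; the query reuses the table and column from the simplified case above. In a Hibernate application the Connection could presumably be obtained through Session.doWork, but that wiring is an assumption and is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of Hibernate or the JDBC driver.
public class PictureStreamer {

    // Copies in fixed-size chunks, so memory use is bounded by the buffer size.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Streams the picture column to 'out' while the statement is still open.
    // Returns the number of bytes copied, 0 for a NULL column, -1 if no row matched.
    public static long streamPicture(Connection connection, long userId, OutputStream out)
            throws SQLException, IOException {
        String sql = "select picture from [User] where id = ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setLong(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return -1;
                }
                // The stream is fully consumed before the statement is closed.
                try (InputStream in = rs.getBinaryStream(1)) {
                    return in == null ? 0 : copy(in, out);
                }
            }
        }
    }
}
```

With this shape, the caller (for example an HTTP handler) would hand over its response output stream, and the database resources are released only after the last byte has been written.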

If I recall correctly, the SQL design doesn't feature streaming of data at all. Either all the data has been returned to the SQL client, or none of it has. It is kind of an atomic thing.

What you might be able to do is split your binary data into multiple smaller blocks and pull the blocks one after another. You can then close your network connection to the database while each data block is being sent to the HTTP client.
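The block-by-block idea could be sketched as follows. This is only an illustration under assumptions: ChunkedBlobReader, ChunkSource, copyInChunks and readChunk are hypothetical names, and the JDBC variant assumes the picture column is varbinary(max) so that SQL Server's SUBSTRING (which is 1-based) can slice it on the server instead of splitting the data at write time.

```java
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper illustrating block-wise reads.
public class ChunkedBlobReader {

    // Supplies the bytes starting at 'offset' (0-based), at most 'length' of them;
    // returns null or an empty array once the end has been reached.
    public interface ChunkSource {
        byte[] read(long offset, int length) throws Exception;
    }

    // Pulls one block at a time and writes it out; only one block is in memory.
    public static long copyInChunks(ChunkSource source, int chunkSize, OutputStream out)
            throws Exception {
        long offset = 0;
        while (true) {
            byte[] chunk = source.read(offset, chunkSize);
            if (chunk == null || chunk.length == 0) {
                break;
            }
            out.write(chunk);
            offset += chunk.length;
            if (chunk.length < chunkSize) {
                break; // a short block means this was the last one
            }
        }
        return offset;
    }

    // Assumed JDBC chunk source: one short query per block.
    public static byte[] readChunk(Connection connection, long userId, long offset, int length)
            throws SQLException {
        String sql = "select substring(picture, ?, ?) from [User] where id = ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setLong(1, offset + 1); // SUBSTRING positions start at 1
            ps.setInt(2, length);
            ps.setLong(3, userId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }
}
```

A call might look like copyInChunks((off, len) -> readChunk(connection, 1L, off, len), 1 << 20, out). Because every block is fetched by a separate query, the row could change between reads, so some versioning or locking scheme would be needed if that matters.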

Alternatively, you could store the binary data as files on your file system and stream from there. That, however, means that you have to take care of consistency between the file system and the database yourself.
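A minimal sketch of the file-based variant, assuming the database row stores only a relative file name (the key) and the bytes live under a root directory. FileStore and its methods are hypothetical names; only java.nio.file calls from the JDK are used.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical file-backed store; the database would keep only the key.
public class FileStore {

    private final Path root;

    public FileStore(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    // Resolves a key to a file under the root and rejects keys that escape it.
    public Path resolve(String key) {
        Path path = root.resolve(key).normalize();
        if (!path.startsWith(root)) {
            throw new IllegalArgumentException("Key escapes the storage root: " + key);
        }
        return path;
    }

    // Streams the file to 'out' without loading it into memory as a whole.
    public long streamTo(String key, OutputStream out) throws IOException {
        return Files.copy(resolve(key), out);
    }
}
```

Keeping the two stores consistent (for example, removing the file when the row is deleted) remains the application's job and is not addressed by this sketch.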
