
Streaming binary data from SQL Server with JPA/Hibernate

I have large files stored on SQL Server, and I need to get them as InputStreams so that I can stream them to clients without exhausting memory. I'm using Hibernate 5.2.11, WildFly 10.1 and Microsoft JDBC Driver 6.2.1 (which supports streaming to and from SQL Server).

To map the InputStream to my entities, I need to create a custom Hibernate type, since Hibernate unfortunately does not provide such a mapping.

import java.io.InputStream;
import java.io.Serializable;
import java.sql.Blob;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Objects;

import org.hibernate.HibernateException;
import org.hibernate.engine.spi.SharedSessionContractImplementor;
import org.hibernate.type.BlobType;
import org.hibernate.type.SerializationException;
import org.hibernate.usertype.UserType;

public class InputStreamType implements UserType {

    private static final int[] SQL_TYPES = { Types.LONGVARBINARY };

    @Override
    public int[] sqlTypes() {
        return SQL_TYPES;
    }

    @Override
    public Class<?> returnedClass() {
        return InputStream.class;
    }

    @Override
    public boolean equals(Object x, Object y) throws HibernateException {
        return Objects.equals(x, y);
    }

    @Override
    public int hashCode(Object x) throws HibernateException {
        return Objects.hashCode(x);
    }

    @Override
    public Object nullSafeGet(ResultSet rs, String[] names, SharedSessionContractImplementor session, Object owner)
            throws HibernateException, SQLException {
        Blob blob = (Blob) BlobType.INSTANCE.nullSafeGet(rs, names, session, owner);
        return blob == null ? null : blob.getBinaryStream();
    }

    @Override
    public void nullSafeSet(PreparedStatement st, Object value, int index, SharedSessionContractImplementor session)
            throws HibernateException, SQLException {
        if (value == null) {
            st.setNull(index, SQL_TYPES[0]);
        } else {
            st.setBinaryStream(index, (InputStream) value);
        }
    }

    @Override
    public Object deepCopy(Object value) throws HibernateException {
        return value;
    }

    @Override
    public boolean isMutable() {
        return false;
    }

    @Override
    public Serializable disassemble(Object value) throws HibernateException {
        throw new SerializationException("Cannot serialize " + InputStream.class.getName(), null);
    }

    @Override
    public Object assemble(Serializable cached, Object owner) throws HibernateException {
        return cached;
    }

    @Override
    public Object replace(Object original, Object target, Object owner) throws HibernateException {
        return original;
    }

}

Then, I annotate my entity field to use this type:

@Entity
public class User {

    @Id
    Long id;

    @Lob
    @Type(type = "InputStreamType")
    InputStream picture;

    // ...

    InputStream getPicture() {
        return this.picture;
    }

    // ...

}

But when I try to read the stream, I get an exception:

@Service
public class UserService {

    @PersistenceContext
    EntityManager em;

    @Transactional
    public void testReadInputStream() {
        InputStream picture = em.find(User.class, 1L).getPicture();
        System.out.println(picture.getClass().getName());
        // prints: com.microsoft.sqlserver.jdbc.PLPInputStream
        picture.read();
        // throws: IOException The TDS protocol stream is not valid.
        //         at com.microsoft.sqlserver.jdbc.PLPInputStream.readBytes(PLPInputStream.java:304)
        //         at com.microsoft.sqlserver.jdbc.PLPInputStream.read(PLPInputStream.java:244)
    }

}

I tried reading the stream inside InputStreamType.nullSafeGet, and there it does NOT throw any exception.

So what happens to the stream after it is returned from InputStreamType.nullSafeGet? How can I keep it usable?

UPDATE 1

I simplified the case to this:

import java.io.IOException;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    public static void main(String[] args) throws Exception {
        Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
        Connection connection = DriverManager.getConnection(
                "jdbc:sqlserver://localhost:1433;DatabaseName=test");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("select picture from [User] where id = 1");
        resultSet.next();
        InputStream inputStream = resultSet.getBlob(1).getBinaryStream();
        System.out.println(inputStream.getClass().getName());
        // prints: com.microsoft.sqlserver.jdbc.PLPInputStream

        inputStream.read();
        // no exception

        statement.close();
        inputStream.read();
        // throws: IOException: The TDS protocol stream is not valid.
    }

}

Could it be that the statement is closed once the custom type returns, which makes the stream inaccessible? If that's the case, how can I overcome this in JPA?

UPDATE 2

My latest finding: if I return the blob's stream, it gets closed when the session is closed; but if I return the Blob itself, the stream is somehow loaded into memory when the session is closed, which exhausts memory for large streams (I think this behavior is triggered by a Hibernate proxy on the Blob).

Is there a way to keep the session open after returning from a repository method, so that the stream stays accessible and the data can be flushed over HTTP without exhausting memory?

If I recall correctly, the design of SQL doesn't provide for streaming data at all: either all of the data has been returned to the SQL client, or none of it has. It is effectively an atomic operation.

What you might be able to do is split your binary data into multiple smaller blocks and pull the blocks one after another. You can then close your connection to the database while each block is being sent to the HTTP client.
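As a rough sketch of the block-by-block idea, the loop could look something like the following. The names ChunkedBlobCopier, ChunkSource and jdbcSource are made up for illustration, and the JDBC part assumes that picture is a varbinary(max) column that can be sliced with T-SQL's SUBSTRING (which is 1-based) and that a javax.sql.DataSource is available. I have not run the query against a real server, so treat it as a starting point rather than a finished implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ChunkedBlobCopier {

    /** Hypothetical abstraction: returns up to length bytes starting at a zero-based offset. */
    @FunctionalInterface
    public interface ChunkSource {
        byte[] read(long offset, int length) throws IOException;
    }

    /** Pulls one block at a time and writes it out; only one block is held in memory. */
    public static long copy(ChunkSource source, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long offset = 0;
        while (true) {
            byte[] chunk = source.read(offset, chunkSize);
            if (chunk == null || chunk.length == 0) {
                break; // nothing left to read
            }
            out.write(chunk);
            offset += chunk.length;
            if (chunk.length < chunkSize) {
                break; // a short block means the end of the data was reached
            }
        }
        return offset; // total number of bytes copied
    }

    /**
     * Assumed JDBC-backed source: opens a connection per block and closes it
     * again before the block is written to the client.
     */
    public static ChunkSource jdbcSource(DataSource dataSource, long userId) {
        return (offset, length) -> {
            String sql = "select substring(picture, ?, ?) from [User] where id = ?";
            try (Connection connection = dataSource.getConnection();
                 PreparedStatement ps = connection.prepareStatement(sql)) {
                ps.setLong(1, offset + 1); // SUBSTRING positions start at 1
                ps.setInt(2, length);
                ps.setLong(3, userId);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        return new byte[0];
                    }
                    byte[] bytes = rs.getBytes(1);
                    return bytes == null ? new byte[0] : bytes;
                }
            } catch (SQLException e) {
                throw new IOException("Failed to read block at offset " + offset, e);
            }
        };
    }
}
```

A caller would then do something like ChunkedBlobCopier.copy(ChunkedBlobCopier.jdbcSource(dataSource, 1L), responseOutputStream, 64 * 1024). Note that the blocks are read in separate statements, so the picture could change between two reads unless you guard against that yourself.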

Alternatively, you could store the binary data as files on your file system and stream from there. That, however, means that you have to take care of the consistency between the file system and the database yourself.
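A minimal sketch of the file-based variant, using only java.nio.file, might look like this. The class FileBackedPictureStore and the "<id>.bin" naming scheme are assumptions made for the example; in practice the entity would probably hold the file name or path in a regular column.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class FileBackedPictureStore {

    private final Path baseDir;

    public FileBackedPictureStore(Path baseDir) {
        this.baseDir = baseDir;
    }

    /** Hypothetical naming scheme: one file per user id. */
    private Path pathFor(long userId) {
        return baseDir.resolve(userId + ".bin");
    }

    /** Writes the uploaded data to disk without buffering the whole stream in memory. */
    public long save(long userId, InputStream in) throws IOException {
        Files.createDirectories(baseDir);
        return Files.copy(in, pathFor(userId), StandardCopyOption.REPLACE_EXISTING);
    }

    /** Streams the file to the given output (for example an HTTP response body). */
    public long streamTo(long userId, OutputStream out) throws IOException {
        return Files.copy(pathFor(userId), out);
    }
}
```

Nothing here keeps the file and the database row in sync: if the transaction that stores the path rolls back, or the row is deleted later, the file has to be cleaned up separately.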


 