How can I stream a large Blob from the database to the application using JPA?

Question

I have a JPA entity class that contains a blob field like this:

@Entity
public class Report {
    private Long id;
    private byte[] content;

    @Id
    @Column(name = "report_id")
    @SequenceGenerator(name = "REPORT_ID_GENERATOR", sequenceName = "report_sequence_id", allocationSize = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "REPORT_ID_GENERATOR")
    public Long getId() {
        return id;
    }
    public void setId(Long id) {
        this.id = id;
    }
    @Lob
    @Column(name = "content")
    public byte[] getContent() {
        return content;
    }

    public void setContent(byte[] content) {
        this.content = content;
    }
}

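I inserted some large data (more than 3 GB) into a record in the database, using a DBMS procedure.

Application users should be able to download the content of these records, so I implemented a method that streams the fetched result to the client's browser.

The problem is that a JPQL select query tends to fetch the entire object from the database before handing it to the application, so whenever I try to access this record through JPA I get an "unable to allocate enough memory" exception.

I have seen some solutions that use a JDBC connection to stream the data from the database, but I could not find any JPA-specific solution.

Does anyone know how I should solve this problem?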

Answer 1

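This is a late answer, but for those still looking for a solution: I found a good article by Thorben Janssen on Thoughts on Java. The drawback is that it is Hibernate-specific, but you seem to be using Hibernate anyway. Basically, the solution is to use java.sql.Blob (and java.sql.Clob) attributes in your entity instead of byte[]: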

@Entity
public class Book {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Lob
    private Clob content;

    @Lob
    private Blob cover;

    ...
}

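Then you use Hibernate's BlobProxy to create the Blob from an InputStream when storing the content, and Blob.getBinaryStream() to read it back as a stream. Have a look at the article for the details.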
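Once the entity exposes a java.sql.Blob, the content can be copied in small chunks instead of being materialized as one byte[]. The following is a minimal sketch of the read side using only the JDBC API; the names BlobStreams and copy are made up for illustration. Per the JDBC documentation a Blob is only valid for the duration of the transaction that created it, so the copy has to happen while that transaction is still open, and whether the data is really streamed depends on the database and its JDBC driver.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Blob;
import java.sql.SQLException;

// Hypothetical helper, not part of JPA or Hibernate.
final class BlobStreams {

    private BlobStreams() {
    }

    // Copies the Blob to the target stream chunk by chunk and returns the number of bytes copied.
    static long copy(Blob blob, OutputStream out) throws SQLException, IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = blob.getBinaryStream()) {
            int read;
            // read() may fill only part of the buffer, so write exactly 'read' bytes
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

With the Book entity above, a call could look like BlobStreams.copy(book.getCover(), response.getOutputStream()), assuming a getter for cover and a servlet response are available.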

Answer 2

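I solved the problem in the following way. Note that this solution may only work with the Hibernate implementation of JPA.

First, I obtained a Hibernate session from the entity manager.
Then I created a prepared statement that selects the blob, using the connection extracted from the session.
Then I obtained an input stream from the prepared statement's result set.

Here is the DAO class used to stream the content: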

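import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Blob;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.persistence.EntityManager; // jakarta.persistence on newer stacks
import javax.persistence.PersistenceContext;

import org.hibernate.Session;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Repository;

@Repository
public class ReportDAO {

    private static final Logger logger = LoggerFactory.getLogger(ReportDAO.class);

    @PersistenceContext
    private EntityManager entityManager;

    // streamToWrite is the stream used to deliver the content to the client
    public void streamReportContent(final Long id, final OutputStream streamToWrite) {
        // Use a dedicated EntityManager instead of overwriting the injected one, and close it afterwards
        EntityManager em = entityManager.getEntityManagerFactory().createEntityManager();
        try {
            Session session = em.unwrap(Session.class);
            session.doWork(connection -> {
                // report_id is the column mapped by @Column(name = "report_id") in the entity
                try (PreparedStatement stmt = connection.prepareStatement(
                        "SELECT content FROM report WHERE report_id = ?")) {
                    stmt.setLong(1, id);
                    try (ResultSet rs = stmt.executeQuery()) {
                        if (rs.next()) { // only read the blob if a row was found
                            Blob blob = rs.getBlob(1);
                            try (InputStream input = blob.getBinaryStream()) {
                                byte[] buffer = new byte[1024];
                                int read;
                                // write only the bytes actually read, not the whole buffer
                                while ((read = input.read(buffer)) != -1) {
                                    streamToWrite.write(buffer, 0, read);
                                }
                            } catch (IOException e) {
                                logger.error("Failure in streaming report", e);
                            }
                        }
                    }
                }
            });
        } catch (Exception e) {
            logger.error("A problem happened while streaming the report", e);
        } finally {
            em.close();
        }
    }
}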

Answer 3

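Since you are using a relational database, storing large (gigabyte-sized) data files in it as BLOBs is not good practice. The common approach is instead to store the data itself as files on a server (possibly an FTP server) and to keep only the metadata (the file path, the server, and so on) in database columns. Streaming the data to the client then becomes much easier.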
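As a rough illustration of this approach, the sketch below streams a file whose relative path would come from a metadata column. It is only a sketch under stated assumptions: the class FileBackedReports, the method streamReport and the idea of a single base storage directory are hypothetical and not part of the question. Files.copy writes the file to the output stream without loading it into memory as a whole.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: the database stores only storedPath, the bytes live on disk.
final class FileBackedReports {

    private FileBackedReports() {
    }

    // Streams baseDir/storedPath to the target stream and returns the number of bytes written.
    static long streamReport(Path baseDir, String storedPath, OutputStream out) throws IOException {
        Path base = baseDir.toAbsolutePath().normalize();
        Path file = base.resolve(storedPath).normalize();
        // Reject paths such as "../secret" that would leave the storage directory
        if (!file.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes the storage directory: " + storedPath);
        }
        return Files.copy(file, out);
    }
}
```

The path check is there because the stored value ends up in a file system call; if the metadata is fully trusted it could be dropped.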

Answer 4

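You should take a look at the community project Spring Content. This project gives you a Spring Data-like approach to content. It is to unstructured data (documents, images, videos, etc.) what Spring Data is to structured data. You could add it with something like the following:

pom.xml (Spring Boot starters are also available)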

   <!-- Java API -->
   <dependency>          
      <groupId>com.github.paulcwarren</groupId>
      <artifactId>spring-content-jpa</artifactId>
      <version>0.9.0</version>
   </dependency>
   <!-- REST API -->
   <dependency>
      <groupId>com.github.paulcwarren</groupId>
      <artifactId>spring-content-rest</artifactId>
      <version>0.9.0</version>
   </dependency>

Configuration

@Configuration
@EnableJpaStores
@Import(org.springframework.content.rest.config.RestConfiguration.class) // enables the REST API
public class ContentConfig {

   // specify the resource specific to your database
   @Value("/org/springframework/content/jpa/schema-drop-h2.sql")
   private Resource dropBlobTables;

   // specify the resource specific to your database
   @Value("/org/springframework/content/jpa/schema-h2.sql")
   private Resource createBlobTables;

   // the DataSource bean is assumed to be defined elsewhere in the application
   @Bean
   DataSourceInitializer datasourceInitializer(DataSource dataSource) {
     ResourceDatabasePopulator databasePopulator =
            new ResourceDatabasePopulator();

     databasePopulator.addScript(dropBlobTables);
     databasePopulator.addScript(createBlobTables);
     databasePopulator.setIgnoreFailedDrops(true);

     DataSourceInitializer initializer = new DataSourceInitializer();
     initializer.setDataSource(dataSource);
     initializer.setDatabasePopulator(databasePopulator);

     return initializer;
   }
}

Note: this configuration is not needed if you use the Spring Boot starters.

To associate content, add the Spring Content annotations to your entity.

Example.java

@Entity
public class Report {

   // replace @Lob field with:

   @ContentId
   private String contentId;

   @ContentLength
   private long contentLength = 0L;

   // if you have rest endpoints
   @MimeType
   private String mimeType = "text/plain";
}

Create a "store":

ExampleStore.java

@StoreRestResource(path="reportContent")
public interface ReportContentStore extends ContentStore<Report, String> {
}

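This is all you need to create a REST endpoint at /reportContent. When your application starts, Spring Content looks at your dependencies (seeing Spring Content JPA/REST), looks at your ReportContentStore interface and injects an implementation of that interface for JPA. It also injects a @Controller that forwards HTTP requests to that implementation. This saves you from having to implement any of this yourself.

So...

curl -X POST /reportContent/{reportId} -F 'data=@path/to/local/file'

will store the content of path/to/local/file in the database and associate it with the report entity whose id is reportId.

curl /reportContent/{reportId}

will fetch it again, and so on. Full CRUD is supported.

There are a couple of getting started guides and videos here. The reference guide is here.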

HTH

Answer 5

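I had a problem similar to yours: I needed to store JSON in a field, and by using a BLOB I caused myself a lot of unnecessary trouble. You are using a BLOB for content-type data; I would suggest using a CLOB instead if the data is in character format.

To sum up my answer: if you are using an Oracle database (a database whose dialect always seems to cause problems), use the format below as a guide or best practice. It is based on the Oracle documentation itself: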

@Lob @Basic(fetch = FetchType.LAZY)
@Column(name="REPORT")
protected String report;

Good luck!

Answer 6

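Maybe you could compress your files with a compression algorithm (lossy or lossless compression, Huffman coding, Facebook's Zstandard) before storing them in the database, and decompress them when you retrieve them.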
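A minimal sketch of that idea, assuming lossless GZIP from the JDK (java.util.zip) as a stand-in, since Zstandard would need a third-party library; the class and method names are hypothetical. Both directions work on streams, so the data is never held in memory as a whole. Compression only shrinks the stored size, so very large content still has to be streamed in and out of the database. InputStream.transferTo requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; GZIP is used here only because it ships with the JDK.
final class GzipStreams {

    private GzipStreams() {
    }

    // Compresses 'in' into 'out'. Closing the GZIP stream writes the trailer and also closes 'out'.
    static void compress(InputStream in, OutputStream out) throws IOException {
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            in.transferTo(gzip);
        }
    }

    // Decompresses GZIP data from 'in' into 'out'. Closing the GZIP stream also closes 'in'.
    static void decompress(InputStream in, OutputStream out) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(in)) {
            gzip.transferTo(out);
        }
    }
}
```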

Answers by score and date

Answer 1: score 3, 2019-12-10 11:40:08
Answer 2: score 2, 2019-07-30 12:14:15
Answer 3: score 1, 2019-07-30 09:41:16
Answer 4: score 0, 2019-07-31 21:28:53
Answer 5: score 0, 2021-03-28 05:56:11
Answer 6: score -2, 2019-07-30 09:57:04
