How to persist LARGE BLOBs (>100MB) in Oracle using Hibernate

I'm struggling to find a way to insert LARGE images (>100MB, mostly TIFF format) into my Oracle database, using BLOB columns.

I've searched thoroughly across the web, and even on StackOverflow, without finding an answer to this problem.
First the problem, then a short section on the relevant code (Java classes/configuration), and finally a third section where I show the JUnit test I've written to test image persistence (I receive the error during its execution).

Edit: I've added a section at the end of the question where I describe some tests and analysis done with JConsole.

The problem

I receive a java.lang.OutOfMemoryError: Java heap space error when using Hibernate to persist very large images/documents:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:190)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:123)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:47)
at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:101)
at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:91)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:283)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:278)
at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:89)
at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2184)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2430)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2874)
at org.hibernate.action.EntityInsertAction.execute(EntityInsertAction.java:79)
at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:273)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:265)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:184)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:51)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1216)
at it.paoloyx.blobcrud.manager.DocumentManagerTest.testInsertDocumentVersion(DocumentManagerTest.java:929)

The code (domain objects, repository classes, configuration)

Here is the stack of technologies I'm using (from the DB up to the business-logic tier). I use JDK 6.

  • Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
  • ojdbc6.jar (for the 11.2.0.3 release)
  • Hibernate 4.0.1 Final
  • Spring 3.1 GA RELEASE

I have two domain classes, mapped in a one-to-many fashion. A DocumentVersion has many DocumentData, each of which can represent different binary content for the same DocumentVersion.

Relevant extract from the DocumentVersion class:

@Entity
@Table(name = "DOCUMENT_VERSION")
public class DocumentVersion implements Serializable {

private static final long serialVersionUID = 1L;
private Long id;
private Set<DocumentData> otherDocumentContents = new HashSet<DocumentData>(0);


@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOV_ID", nullable = false)
public Long getId() {
    return id;
}

@OneToMany
@Cascade({ CascadeType.SAVE_UPDATE })
@JoinColumn(name = "DOD_DOCUMENT_VERSION")
public Set<DocumentData> getOtherDocumentContents() {
    return otherDocumentContents;
}

Relevant extract from the DocumentData class:

@Entity
@Table(name = "DOCUMENT_DATA")
public class DocumentData {

private Long id;

/**
 * The binary content (java.sql.Blob)
 */
private Blob binaryContent;

@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOD_ID", nullable = false)
public Long getId() {
    return id;
}

@Lob
@Column(name = "DOD_CONTENT")
public Blob getBinaryContent() {
    return binaryContent;
}

Here are the main parameters of my Spring and Hibernate configuration:

<bean id="sessionFactory"
    class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="packagesToScan" value="it.paoloyx.blobcrud.model" />
    <property name="hibernateProperties">
        <props>
            <prop key="hibernate.dialect">org.hibernate.dialect.Oracle10gDialect</prop>
            <prop key="hibernate.hbm2ddl.auto">create</prop>
            <prop key="hibernate.jdbc.batch_size">0</prop>
            <prop key="hibernate.jdbc.use_streams_for_binary">true</prop>
        </props>
    </property>
</bean>
<bean class="org.springframework.orm.hibernate4.HibernateTransactionManager"
    id="transactionManager">
    <property name="sessionFactory" ref="sessionFactory" />
</bean>
<tx:annotation-driven transaction-manager="transactionManager" />

My datasource definition:

<bean class="org.apache.commons.dbcp.BasicDataSource"
    destroy-method="close" id="dataSource">
    <property name="driverClassName" value="${database.driverClassName}" />
    <property name="url" value="${database.url}" />
    <property name="username" value="${database.username}" />
    <property name="password" value="${database.password}" />
    <property name="testOnBorrow" value="true" />
    <property name="testOnReturn" value="true" />
    <property name="testWhileIdle" value="true" />
    <property name="timeBetweenEvictionRunsMillis" value="1800000" />
    <property name="numTestsPerEvictionRun" value="3" />
    <property name="minEvictableIdleTimeMillis" value="1800000" />
    <property name="validationQuery" value="${database.validationQuery}" />
</bean>

where the properties are taken from here:

database.driverClassName=oracle.jdbc.OracleDriver
database.url=jdbc:oracle:thin:@localhost:1521:devdb
database.username=blobcrud
database.password=blobcrud
database.validationQuery=SELECT 1 from dual

I've got a service class that delegates to a repository class:

@Transactional
public class DocumentManagerImpl implements DocumentManager {

DocumentVersionDao documentVersionDao;

public void setDocumentVersionDao(DocumentVersionDao documentVersionDao) {
    this.documentVersionDao = documentVersionDao;
}

And now the relevant extracts from the repository classes:

public class DocumentVersionDaoHibernate implements DocumentVersionDao {

@Autowired
private SessionFactory sessionFactory;

@Override
public DocumentVersion saveOrUpdate(DocumentVersion record) {
    this.sessionFactory.getCurrentSession().saveOrUpdate(record);
    return record;
}
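
The test below also calls a findByPK method that isn't shown in this excerpt; a minimal sketch of what it presumably looks like (hypothetical, reconstructed from how the test uses it) is:

@Override
public DocumentVersion findByPK(Long id) {
    // Hypothetical implementation inferred from the test below:
    // a plain session.get() lookup by primary key
    return (DocumentVersion) this.sessionFactory.getCurrentSession()
            .get(DocumentVersion.class, id);
}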

The JUnit test that causes the error

If I run the following unit test, I get the aforementioned error (java.lang.OutOfMemoryError: Java heap space):

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath*:META-INF/spring/applicationContext*.xml" })
@Transactional
public class DocumentManagerTest {

@Autowired
protected DocumentVersionDao documentVersionDao;

@Autowired
protected SessionFactory sessionFactory;

@Test
public void testInsertDocumentVersion() throws SQLException {

    // Original mock document content
    DocumentData dod = new DocumentData();
    // image.tiff is approx. 120MB
    File veryBigFile = new File("/Users/paoloyx/Desktop/image.tiff");
    try {
        Session session = this.sessionFactory.getCurrentSession();
        InputStream inStream = FileUtils.openInputStream(veryBigFile);
        Blob blob = Hibernate.getLobCreator(session).createBlob(inStream, veryBigFile.length());
        dod.setBinaryContent(blob);
    } catch (IOException e) {
        e.printStackTrace();
        dod.setBinaryContent(null);
    }

    // Save a document version linked to previous document contents
    DocumentVersion dov = new DocumentVersion();
    dov.getOtherDocumentContents().add(dod);
    documentVersionDao.saveOrUpdate(dov);
    this.sessionFactory.getCurrentSession().flush();
    // Capture the generated id (missing in the original snippet)
    Long insertedId = dov.getId();

    // Clear session, then try retrieval
    this.sessionFactory.getCurrentSession().clear();
    DocumentVersion dbDov = documentVersionDao.findByPK(insertedId);
    Assert.assertNotNull("The document version returned for id " + insertedId + " is null", dbDov);
    Assert.assertNotNull("The retrieved document version has no associated additional contents", dbDov.getOtherDocumentContents());
    Assert.assertEquals("The number of secondary contents does not match what was saved", 1, dbDov.getOtherDocumentContents().size());
}

The same code works against a PostgreSQL 9 installation: the image is written to the database. Debugging my code, I found that the PostgreSQL JDBC driver writes to the database using a buffered output stream, while the Oracle OJDBC driver tries to allocate, all at once, the whole byte[] representing the image.

From the error stack:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
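
For comparison, a plain-JDBC insert that streams the content instead of materializing a byte[] would look roughly like this (a minimal sketch, not code from the question; someGeneratedId is a placeholder, and whether the driver really streams in chunks still depends on the driver version):

Connection con = dataSource.getConnection();
try {
    PreparedStatement ps = con.prepareStatement(
            "INSERT INTO DOCUMENT_DATA (DOD_ID, DOD_CONTENT) VALUES (?, ?)");
    InputStream in = new FileInputStream(veryBigFile);
    try {
        ps.setLong(1, someGeneratedId); // placeholder id
        // JDBC 4.0: pass the stream and its length so the driver can
        // chunk the upload instead of buffering the whole file in memory
        ps.setBinaryStream(2, in, veryBigFile.length());
        ps.executeUpdate();
    } finally {
        in.close();
        ps.close();
    }
} finally {
    con.close();
}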

Is the error due to this behavior? Can anyone give me some insights into this problem?

Thanks everyone.

Memory Tests with JConsole

Thanks to the suggestions received on my question, I've tried some simple tests to show the memory usage of my code with two different JDBC drivers, one for PostgreSQL and one for Oracle. Test setup:

  1. The tests were conducted using the JUnit test described in the previous section.
  2. The JVM heap size was set to 512MB, using the parameter -Xmx512m (see the Surefire snippet after this list).
  3. For the Oracle database, I used the ojdbc6.jar driver.
  4. For the Postgres database, I used the 9.0-801.jdbc3 driver (via Maven).
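
For reference, when the test runs through Maven, the heap limit can be passed to the forked test JVM via the Surefire plugin's argLine (a sketch of the relevant pom.xml fragment):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <argLine>-Xmx512m</argLine>
    </configuration>
</plugin>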

First test, with a file of approx. 150MB

In this first test, both Oracle and Postgres passed (this is BIG news). The file is sized about 1/3 of the available JVM heap. Here is the picture of JVM memory consumption:

(Image: testing Oracle, 512MB heap size, 150MB file)

(Image: testing PostgreSQL, 512MB heap size, 150MB file)

Second test, with a file of approx. 485MB

In this second test, only Postgres passed; Oracle failed. The file is sized very close to the available JVM heap space. Here is the picture of JVM memory consumption:

(Image: testing Oracle, 512MB heap size, 485MB file)

(Image: testing PostgreSQL, 512MB heap size, 485MB file)

Analysis of the tests:

It seems that the PostgreSQL driver handles memory without exceeding a certain threshold, while the Oracle driver behaves very differently.

I honestly can't explain why the Oracle JDBC driver leads to the same java.lang.OutOfMemoryError: Java heap space when the file is sized near the available heap space.

Can anyone give me more insights? Thanks a lot for your help :)

I was having the same problems as you when attempting to map using the "blob" type. Here is a link to a post I made on the Hibernate forum: https://forum.hibernate.org/viewtopic.php?p=2452481#p2452481

Hibernate 3.6.9
Oracle Driver 11.2.0.2.0
Oracle Database 11.2.0.2.0

To fix the problem, I used a custom UserType for the Blob, with java.sql.Blob as the return type.

Here are the key method implementations of this UserType:

public Object nullSafeGet(ResultSet rs, String[] names, Object owner) throws HibernateException, SQLException {

   Blob blob = rs.getBlob(names[0]);
   if (blob == null)
      return null;

   return blob;
}

public void nullSafeSet(PreparedStatement st, Object value, int index)
     throws HibernateException, SQLException {
   if (value == null) {
      st.setNull(index, sqlTypes()[0]);
   }
   else {
      InputStream in = null;
      OutputStream out = null;
      // oracle.sql.BLOB
      BLOB tempBlob = BLOB.createTemporary(st.getConnection(), true, BLOB.DURATION_SESSION);
      tempBlob.open(BLOB.MODE_READWRITE);
      out = tempBlob.getBinaryOutputStream();
      Blob valueAsBlob = (Blob) value;
      in = valueAsBlob.getBinaryStream();
      StreamUtil.toOutput(in, out); // copies the source stream into the temporary BLOB (StreamUtil is the author's own helper)
      out.flush();
      StreamUtil.close(out);
      tempBlob.close();
      st.setBlob(index, tempBlob);
      StreamUtil.close(in);
   }
}
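
To map a property through such a UserType, the Blob getter would then be annotated with Hibernate's @Type, along these lines (the fully-qualified class name is illustrative):

@Lob
@Column(name = "DOD_CONTENT")
@Type(type = "com.example.OracleBlobUserType") // hypothetical FQN of the UserType above
public Blob getBinaryContent() {
    return binaryContent;
}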

Personally, I store files of up to 200MB in Oracle BLOB columns using Hibernate, so I can assure you it works. So...

You should try a newer version of the Oracle JDBC driver. It seems this behavior of using byte arrays instead of streams has changed a bit over time, and the drivers are backward compatible. I'm not sure whether that will fix your problem, but it works for me. Additionally, you should switch to org.hibernate.dialect.Oracle10gDialect, which retires the use of the oracle.jdbc.driver package in favor of oracle.jdbc; that might also help.

I just discovered this question while having the same problem with Oracle and Hibernate. The issue is in Hibernate's blob handling: it copies the blob into memory depending on the Dialect in use. I guess they do so because some databases/drivers require it; for Oracle, though, this behavior does not seem to be necessary.

The fix is pretty simple: just create a custom Oracle dialect containing this code:

public class Oracle10DialectWithoutInputStreamToInsertBlob extends Oracle10gDialect {
    @Override
    public boolean useInputStreamToInsertBlob() {
        // Tell Hibernate to bind the Blob directly instead of
        // copying its content to a byte[] first
        return false;
    }
}

Next, you need to configure your session factory to use this dialect. I've tested it with the ojdbc6-11.2.0.1.0 driver against Oracle 11g and confirmed that it fixes the memory-consumption issue.
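
With the Spring configuration shown in the question, that means pointing the hibernate.dialect property at the custom class (the package name here is illustrative):

<prop key="hibernate.dialect">com.example.Oracle10DialectWithoutInputStreamToInsertBlob</prop>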

If any of you try this with another Oracle database and/or a different Oracle driver, I would love to hear whether it works for you. If it works with several configurations, I'll send a pull request to the Hibernate team.

It's not the best solution, but you can allow Java to use more memory with the -Xmx parameter.
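
For example, launching the JVM with a 2GB heap (the class name is just a placeholder):

java -Xmx2g com.example.MyApp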

Edit: you should try to analyse the problem in more depth; try using JConsole. It helps you see the memory load.

Even with Postgres you might get near the heap-size limit without crossing it, because the loaded driver takes a bit less memory.

With default settings, your heap-size limit is about half your physical memory. Try how much bigger a blob you can save into Postgres.

Have you tried defining a LobHandler, and its Oracle-specific version OracleLobHandler, on your session factory?

Here is an example:

<bean id="sessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">
    <property name="dataSource" ref="oracleDocDataSource"/>
    <property name="annotatedClasses">
        <list>
        ...
        </list>
    </property>
    <property name="lobHandler">
        <bean class="org.springframework.jdbc.support.lob.OracleLobHandler">
            <property name="nativeJdbcExtractor">
                <bean class="org.springframework.jdbc.support.nativejdbc.WebSphereNativeJdbcExtractor"/>
            </property>
        </bean>
    </property>
</bean>

UPDATE

I've just realized that the question is about Hibernate 4.
