Read large file (Java Heap Space)
I want to read a CSV file, create an object from every row, and then save these objects to a database. When I read all lines from the file and store all the objects in an ArrayList, I get a Java Heap Space error.
I tried to save every record immediately after reading it, but then saving the records with Hibernate's save() method takes a lot of time.
I also tried checking the size of the ArrayList and saving the data whenever it reaches 100k entries (the commented-out part of the code).
Question: Is there a way to read the file in parts, or a better way to store the data in Java?
String[] colNames;
String[] values;
String line;
Map<Object1, Object1> newObject1Objects = new HashMap<Object1, Object1>();
Map<Object1, Integer> objIdMap = objDao.createObjIdMap();
StringBuilder raportBuilder = new StringBuilder();
Long lineCounter = 1L;
BufferedReader reader = new BufferedReader(new InputStreamReader(
        new FileInputStream(filename), "UTF-8"));
colNames = reader.readLine().split(";");
int columnLength = colNames.length;
while ((line = reader.readLine()) != null) {
    lineCounter++;
    line = line.replace("\"", "").replace("=", "");
    values = line.split(";", columnLength);
    // Object1
    Object1 object1 = createObject1Object(values);
    if (objIdMap.containsKey(object1)) {
        object1.setObjId(objIdMap.get(object1));
    } else if (newObject1Objects.containsKey(object1)) {
        object1 = newObject1Objects.get(object1);
    } else {
        newObject1Objects.put(object1, object1);
    }
    // ==============================================
    // Object2
    Object2 object2 = createObject2Object(values, object1,
            lineCounter, raportBuilder);
    listOfObject2.add(object2);
    /*
    logger.error("listOfObject2.size():" + listOfObject2.size());
    if (listOfObject2.size() % 100000 == 0) {
        object2Dao.performImportOperation(listOfObject2);
        listOfObject2.clear();
    }
    */
}
object2Dao.performImportOperation(listOfObject2);
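The commented-out chunking idea in the question is the right direction: only one chunk of rows ever needs to live on the heap. Below is a minimal, self-contained sketch of that pattern, with a hypothetical `flush` callback standing in for the question's `object2Dao.performImportOperation` (the class and method names here are illustrative, not from the original code):

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkedCsvImport {
    static final int BATCH_SIZE = 3; // tiny for the demo; use e.g. 50_000 in practice

    // Reads rows and hands them to `flush` in fixed-size chunks, so only one
    // chunk is held in memory at a time. `flush` stands in for the DAO call.
    static int importInChunks(BufferedReader reader, Consumer<List<String[]>> flush)
            throws Exception {
        List<String[]> chunk = new ArrayList<>();
        String line;
        int total = 0;
        while ((line = reader.readLine()) != null) {
            chunk.add(line.split(";"));
            total++;
            if (chunk.size() == BATCH_SIZE) {
                flush.accept(chunk);
                chunk.clear(); // drop references so the GC can reclaim the rows
            }
        }
        if (!chunk.isEmpty()) { // don't forget the final partial chunk
            flush.accept(chunk);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        String csv = "a;1\nb;2\nc;3\nd;4\ne;5";
        int[] flushes = {0};
        int rows = importInChunks(new BufferedReader(new StringReader(csv)),
                chunk -> flushes[0]++);
        System.out.println(rows + " rows in " + flushes[0] + " flushes");
    }
}
```

Note that the `newObject1Objects` deduplication map in the question still grows with the number of distinct Object1 keys; if that set is itself huge, it needs the same treatment (e.g. flushing it per chunk or looking keys up in the database).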
Increasing the max heap size won't help if you want to process really large files. Your friend is batching.
Hibernate doesn't employ JDBC batching implicitly; every INSERT and UPDATE statement is executed separately. Read "How do you enable batch inserts in hibernate?" for information on how to enable it.
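Enabling it boils down to two pieces: turning on JDBC batching in the Hibernate configuration, and periodically flushing and clearing the Session during the import. A minimal configuration sketch (the property names are from the Hibernate documentation; the value 50 is only a starting point to tune):

```properties
# hibernate.properties / persistence.xml
hibernate.jdbc.batch_size=50
hibernate.order_inserts=true
hibernate.order_updates=true
```

With batching enabled, call `session.flush()` followed by `session.clear()` every `batch_size` saves: otherwise every saved entity stays in the first-level cache, and the persistence context itself will exhaust the heap even though the SQL is batched.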
Pay attention to IDENTITY generators: because the database generates the ID during the insert itself, Hibernate disables JDBC insert batching for such entities. Prefer a SEQUENCE or TABLE generator where your database supports one.