简体   繁体   中英

Process Large Files in Java

I have a requirement like, need to process records from a text file and insert/update in to a table. Following is the code that I have written. But when the records in the file are 50,000 its taking more than 30 minutes to process the records, and if the records are close to 80k, out of memory error is thrown. Can anyone please suggest a way to optimize the code that I have written to improve performance ?

public static String insertIntoCHG_PNT_Table(String FILE_NAME) throws NumberFormatException, IOException
    {
        Date DATE_INSERTED = new Date();
        String strLine = "";
        FileReader fr = new FileReader(FILE_NAME);
        BufferedReader br = new BufferedReader(fr);
        long SEQ = 0;
        double consumption = 1;
        String returnString = "";
        CHG_PNT insertObj = null;
        long KY_PREM_NO = 0;
        long KY_SPT = 0;
        String COD_COT_TYP = "";
        String DT_EFF = "";
        String TS_KY_TOT = "";
        String COD_COT = "";
        String ACL_VLE = "";
        String ACL_QTY = "";
        String WTR_VLE = "";
        String WTR_QTY = "";
        String SWG_VLE = "";
        String SWG_QTY = "";
        String CD_TYPE_ACT = "";
        String DT_TERM = "";
        String CD_STAT = "";
        String DT_STAT = "";
        String VLN_PPE_SIZ_COD = "";
        String WTR_PPE_SIZ_MTD = "";
        String SWG_PPE_SIZ_MTD = "";
        while( (strLine = br.readLine()) != null){ 
            /*
             * Meter Serial No, Property No, Current Meter Index, Previous meter index, Consumption needs to be added
             * 
             * 
             */
            String[] split = strLine.split("\\;");  
            KY_PREM_NO = Long.parseLong(split[0].trim());
            KY_SPT = Long.parseLong(split[1].trim());
            COD_COT_TYP = split[2].trim();
            DT_EFF = split[3].trim();
            TS_KY_TOT = split[4].trim();
            COD_COT = split[5].trim();
            ACL_VLE = split[6].trim();
            ACL_QTY = split[7].trim();
            WTR_VLE = split[8].trim();
            WTR_QTY = split[9].trim();
            SWG_VLE = split[10].trim();
            SWG_QTY = split[11].trim();
            CD_TYPE_ACT = split[12].trim();
            DT_TERM = split[13].trim();
            CD_STAT = split[14].trim();
            DT_STAT = split[15].trim();
            VLN_PPE_SIZ_COD = split[16].trim();
            WTR_PPE_SIZ_MTD = split[17].trim();
            SWG_PPE_SIZ_MTD = split[18].trim();

            long counter = 0;
            long newCounter = 0;
            CHG_PNT checkRecordCount = null;
            checkRecordCount = checkAndUpdateRecord(KY_PREM_NO,KY_SPT,COD_COT_TYP,TS_KY_TOT);

            try {

                if(checkRecordCount == null)
                    insertObj = new CHG_PNT();
                else
                    insertObj = checkRecordCount;
                insertObj.setKY_PREM_NO(KY_PREM_NO);
                //insertObj.setSEQ_NO(SEQ);
                insertObj.setKY_SPT(KY_SPT);
                insertObj.setCOD_COT_TYP(COD_COT_TYP);
                insertObj.setDT_EFF(DT_EFF);
                insertObj.setTS_KY_TOT(TS_KY_TOT);
                insertObj.setCOD_COT(COD_COT);
                insertObj.setACL_VLE(Double.parseDouble(ACL_VLE));
                insertObj.setACL_QTY(Double.parseDouble(ACL_QTY));
                insertObj.setWTR_VLE(Double.parseDouble(WTR_VLE));
                insertObj.setWTR_QTY(Double.parseDouble(WTR_QTY));
                insertObj.setSWG_VLE(Double.parseDouble(SWG_VLE));
                insertObj.setSWG_QTY(Double.parseDouble(SWG_QTY));
                insertObj.setCD_TYPE_ACT(CD_TYPE_ACT);
                insertObj.setDT_TERM(DT_TERM);
                insertObj.setCD_STAT(Double.parseDouble(CD_STAT));
                insertObj.setDT_STAT(DT_STAT);
                insertObj.setVLN_PPE_SIZ_COD(VLN_PPE_SIZ_COD);
                insertObj.setWTR_PPE_SIZ_MTD(WTR_PPE_SIZ_MTD);
                insertObj.setSWG_PPE_SIZ_MTD(SWG_PPE_SIZ_MTD);
                insertObj.setDATE_INSERTED(DATE_INSERTED);
                if(checkRecordCount == null)
                {
                    insertObj.setDATE_INSERTED(DATE_INSERTED);
                    insertObj.insert();
                }
                else
                {
                    insertObj.setDATE_MODIFIED(DATE_INSERTED);
                    insertObj.update();
                }
                BSF.getObjectManager()._commitTransactionDirect(true);

            }catch(Exception e)
            {
                String abc = e.getMessage();
            }

        }
        fr.close();
        br.close();
        String localPath = FILE_NAME;
        File f = new File(FILE_NAME);
        String fullPath = f.getParent();
        String fileName = f.getName();
        String SubStr1 = new String("Processing");
        int index = fullPath.lastIndexOf(SubStr1);
        String path = fullPath.substring(0, index);
        String destPath = path+"\\Archive\\"+fileName;
        PMP_PROPERTIES.copyFile(new File(localPath),new File(destPath));
        File file = new File(FILE_NAME);
        file.delete();
        return null;
    }

There are two main problems. The first one is a performance problem - and, contrary to your intuition, the problem is the database insertion speed.

You are inserting each item in a separate transaction. You should not do that if you want your inserts to be quick. Introduce a counter variable and perform a commint only each N inserts and at the end.

int commitStep = 100;
int modCount = 0;

while() {
  //... your code
  modCount++;
  if ( modCount % commitStep == 0  ) { 
    BSF.getObjectManager()._commitTransactionDirect(true);
  }
}

You can read more about sql insert speed-up here: Sql insert speed up

The second problem is, possibly, file reading scalability. It will work for smaller files, but not for larger ones. This question Read large files in Java has some good answers to your problem.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM