Encoding PDF files to Base64 takes too long when there are 100k documents

I am trying to encode PDF documents to Base64. With a small number of files (around 2,000 documents) it works fine, but I have more than 100k documents to encode.

Encoding all of those files takes a long time. Is there a better approach for encoding such a large data set?

Here is my current approach:

 // Needs: java.io.File, java.io.IOException, java.nio.file.Files, java.util.Base64
 String filepath = doc.getPath().concat(doc.getFilename());

 file = new File(filepath);
 if (file.exists() && !file.isDirectory()) {
     try {
         // Files.readAllBytes reads the whole file and closes it afterwards;
         // a single InputStream.read(bytes) call may return before the array is full.
         byte[] bytes = Files.readAllBytes(file.toPath());
         // encodeToString already returns a String, so no extra copy is needed.
         encodedfile = Base64.getEncoder().encodeToString(bytes);
     } catch (IOException e) {
         e.printStackTrace();
     }
 }

Try this:

  1. Figure out how many files you need to encode.

     long files = Files.list(Paths.get(directory)).count(); // count() returns a long, not an int
  2. Split them into batches of a size that a single Java thread can reasonably handle. For example, if you have 100k files to encode, split them into 100 lists of 1,000 files each, or something similar.

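     int currentIndex = 0;
     for (File file : filesInDir) {
         // Create the batch on first use so get() never returns null.
         List<File> batch = fileMap.computeIfAbsent(currentIndex, k -> new ArrayList<>());
         if (batch.size() >= cap) {
             batch = fileMap.computeIfAbsent(++currentIndex, k -> new ArrayList<>());
         }
         batch.add(file);
     }
     /* It takes a little more effort than this in practice, but this is the idea. */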
  3. Start the worker threads one after another, as long as the computer's resources are available.

     for (Integer key : fileMap.keySet()) {
         new WorkerThread(fileMap.get(key)).start();
     }
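
The three steps above can be sketched end to end as follows. This is only a minimal illustration, not the answer's actual code: the class `BatchEncoder` and its methods `listFiles`, `partition`, and `encodeAll` are hypothetical names, and a fixed-size `ExecutorService` is swapped in for hand-started `WorkerThread` objects so that the number of concurrent threads stays bounded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; none of these names come from the original post.
final class BatchEncoder {

    // Step 1: collect the regular files in a directory.
    // try-with-resources closes the directory stream when done.
    static List<Path> listFiles(Path directory) throws IOException {
        try (Stream<Path> entries = Files.list(directory)) {
            return entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    // Step 2: split the files into batches of at most `cap` entries.
    static List<List<Path>> partition(List<Path> files, int cap) {
        List<List<Path>> batches = new ArrayList<>();
        for (int start = 0; start < files.size(); start += cap) {
            int end = Math.min(start + cap, files.size());
            batches.add(new ArrayList<>(files.subList(start, end)));
        }
        return batches;
    }

    // Step 3: encode every batch on a pool with a fixed number of threads.
    static Map<Path, String> encodeAll(List<Path> files, int cap, int threads) throws Exception {
        Map<Path, String> encoded = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Path> batch : partition(files, cap)) {
                pending.add(pool.submit(() -> {
                    for (Path file : batch) {
                        try {
                            byte[] bytes = Files.readAllBytes(file);
                            encoded.put(file, Base64.getEncoder().encodeToString(bytes));
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    }
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // waits for the batch and rethrows any worker failure
            }
        } finally {
            pool.shutdown();
        }
        return encoded;
    }
}
```

A call such as `BatchEncoder.encodeAll(BatchEncoder.listFiles(dir), 1000, Runtime.getRuntime().availableProcessors())` would process the directory in batches of 1,000. Keeping every encoded string in a map is only for illustration; with 100k PDFs it is probably better to write each result to its destination inside the worker instead of holding them all in memory.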

You can check the current resources available with:

 public boolean areResourcesAvailable() {
     return imNotThatNice();
 }

/**
 * Gets the shared OperatingSystemMXBean, creating it on first use.
 *
 * @return the operating system MXBean held by ResourceUtil
 */
private static OperatingSystemMXBean getInstance() {
    if (ResourceUtil.instance == null) {
        ResourceUtil.instance = ManagementFactory.getOperatingSystemMXBean();
    }
    return ResourceUtil.instance;
}
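
The snippet above leaves `imNotThatNice()` undefined, so the actual check is not shown. One possible stand-in, sketched here under assumptions, compares the system load average per core against a threshold. The class `ResourceCheck`, the method `hasHeadroom`, and the `maxLoadPerCore` parameter are hypothetical; only `getSystemLoadAverage()` and `getAvailableProcessors()` are real `OperatingSystemMXBean` methods.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical utility; the class and method names are illustrative only.
final class ResourceCheck {

    private static final OperatingSystemMXBean OS = ManagementFactory.getOperatingSystemMXBean();

    // Pure decision rule: is the load per core below the given threshold?
    static boolean hasHeadroom(double loadAverage, int processors, double maxLoadPerCore) {
        if (loadAverage < 0) {
            // getSystemLoadAverage() returns a negative value when the platform
            // cannot report it; in that case this sketch does not block work.
            return true;
        }
        return loadAverage / processors < maxLoadPerCore;
    }

    // Applies the rule to the current readings from the operating system MXBean.
    static boolean areResourcesAvailable(double maxLoadPerCore) {
        return hasHeadroom(OS.getSystemLoadAverage(), OS.getAvailableProcessors(), maxLoadPerCore);
    }
}
```

A caller could poll `ResourceCheck.areResourcesAvailable(0.8)` before starting the next batch. The 0.8 threshold is an arbitrary example value, and the load average is a one-minute figure, so it reacts slowly to short bursts of work.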
