
Wordlist generator, heap size error in Java

I'm trying to create a program that generates a word list from a small set (10-100) of original input words. The end result contains millions, possibly billions, of lines, with one word on each line. I've come far enough that I can generate up to about 5 million words, but whenever I run something that would generate far more, such as 100 million, the program crashes after roughly 1 minute and 9 seconds. Here is the error output:

    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3210)
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at wordlistgen.WordlistGen2.combineWords(WordlistGen2.java:129)
    at wordlistgen.WordlistGen2.main(WordlistGen2.java:25)
    /home/NAME/.cache/netbeans/8.1/executor-snippets/run.xml:53: Java returned: 1
    BUILD FAILED (total time: 1 minute 9 seconds)

I have tried to increase the heap size for NetBeans by adding -J-Xms1024m -J-Xmx2048m to my netbeans.conf file (I'm running Ubuntu 17.10), but the error persists.
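One way to check whether a heap setting actually reaches the program is to print the limit from inside the program itself. The following is a minimal sketch; the class name HeapCheck is hypothetical, while Runtime.maxMemory() is part of the standard library. As far as I know, -J options in netbeans.conf are passed to the JVM that runs the IDE, and a project started from the IDE runs in a separate JVM, so the value printed here may not reflect them. If it does not, the project's own VM options (for example -Xmx2048m) are the likelier place for the setting.

```java
public class HeapCheck {
    /** Returns the maximum heap the current JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run this the same way the word list generator is run,
        // so that it reports the limit that program actually gets.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```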

Essentially what the program does is import the original 10-100 words:

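    static void importList() throws IOException {
        // listOfLists and loll are static fields declared elsewhere in the class.
        ArrayList<String> rawList = new ArrayList<>();

        // Read the input file line by line; each line is one original word.
        try (BufferedReader br = new BufferedReader(new FileReader("textfile"))) {
            for (String line; (line = br.readLine()) != null; ) {
                rawList.add(line);
            }

            listOfLists.add(rawList);
            loll++;
        }
    }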

Then, with a bunch of for loops, I create new variations of the words: capitalized letters, numbers at the end, substrings of the entire word, and so on. The words are stored in different ArrayLists, which are in turn stored in an ArrayList of ArrayLists, i.e. an ArrayList<ArrayList<String>>.
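To make the kind of variation described above concrete, here is a minimal sketch for a single word. The class Variations, the method of, and the parameter maxDigit are hypothetical names chosen for illustration; the actual loops are only in the linked code and may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class Variations {
    /**
     * Hypothetical helper: returns the word itself, a capitalized form,
     * the word with each digit 0..maxDigit appended, and its prefixes
     * of length 2 up to (but not including) the full length.
     */
    static List<String> of(String word, int maxDigit) {
        List<String> out = new ArrayList<>();
        if (word.isEmpty()) {
            return out;
        }
        out.add(word);

        // Capitalize the first letter, skipping the result if nothing changed.
        String capitalized = Character.toUpperCase(word.charAt(0)) + word.substring(1);
        if (!capitalized.equals(word)) {
            out.add(capitalized);
        }

        // Append a number to the end of the word.
        for (int d = 0; d <= maxDigit; d++) {
            out.add(word + d);
        }

        // Substrings of the word, here limited to prefixes.
        for (int len = 2; len < word.length(); len++) {
            out.add(word.substring(0, len));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(of("cat", 1)); // prints [cat, Cat, cat0, cat1, ca]
    }
}
```

Even this small helper turns one word into five, which shows how quickly the totals multiply once variations are combined with each other.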

When I'm done combining and manipulating words, I write the entire final ArrayList, line by line, to an output file using the following method:

    static void outputFile(String fileName) throws IOException {
        try (FileWriter writer = new FileWriter(fileName)) {
            for (String str : finalList) {
                writer.write(str + "\n");
            }
        }
    }
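The method above only runs once the whole list already exists, and the stack trace shows the heap running out earlier, while the list is still growing inside combineWords. One possible alternative is to write each word to the file at the moment it is generated, so the complete list never has to sit in memory. The sketch below assumes that approach; StreamingOutput and writeCombinations are hypothetical names, and joining every ordered pair of words is only a stand-in for the real combination logic.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class StreamingOutput {
    /**
     * Writes every ordered pair a+b of the input words to out, one per line,
     * and returns the number of lines written. Nothing is accumulated in a list.
     */
    static long writeCombinations(List<String> words, Writer out) throws IOException {
        long count = 0;
        for (String a : words) {
            for (String b : words) {
                out.write(a);
                out.write(b);
                out.write('\n');
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        List<String> words = Arrays.asList("red", "blue", "green");
        // A BufferedWriter batches the many small writes into larger disk writes.
        try (BufferedWriter writer =
                 Files.newBufferedWriter(Paths.get("wordlist.txt"), StandardCharsets.UTF_8)) {
            long written = writeCombinations(words, writer);
            System.out.println("Wrote " + written + " lines");
        }
    }
}
```

With this shape, memory use depends on the small input list rather than on the number of generated lines, at the cost of not being able to sort or de-duplicate the output in memory afterwards.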

The entire code can be found here: https://pastebin.com/0fkvwYbx

I'm hoping that I'm missing something obvious, or that I've misinterpreted the error message. Either way, if someone could find a solution that lets me generate longer lists, I'd be very grateful.

Maybe ArrayList is not the appropriate List implementation for your problem. Please see: When to use LinkedList over ArrayList?

I think you are constantly hitting the worst-case scenario described there (quoting):

add(E element) is O(1) amortized, but O(n) worst-case since the array must be resized and copied

This is inefficient not only in time but also in memory, since every resize needs a second, larger copy of an already huge backing array for your ArrayLists. Consider using LinkedList, especially since your code does not appear to access the lists by index.
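A minimal sketch of the swap, assuming the generating code is written against the List interface so the implementation can be changed in one place. ListChoice and fill are hypothetical names, not taken from the question's code. The sketch also shows an ArrayList created with an initial capacity, which avoids repeated regrowth when the final size is roughly known. Note that LinkedList allocates a node object per element, so it is worth measuring both variants on realistic sizes before settling on one.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListChoice {
    /** Appends count generated words to target; works with any List implementation. */
    static void fill(List<String> target, String base, int count) {
        for (int i = 0; i < count; i++) {
            target.add(base + i);
        }
    }

    public static void main(String[] args) {
        int count = 1_000;

        // LinkedList: each add allocates one node, no backing array to copy.
        List<String> linked = new LinkedList<>();
        fill(linked, "word", count);

        // ArrayList with the capacity reserved up front: no regrowth while filling.
        List<String> presized = new ArrayList<>(count);
        fill(presized, "word", count);

        System.out.println(linked.size() + " " + presized.size());
    }
}
```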
