
OutOfMemoryError exception: Java heap space, how to debug…?

I'm getting a java.lang.OutOfMemoryError exception: Java heap space.

I'm parsing an XML file, storing the data, and outputting an XML file when the parsing is complete.

I'm a bit surprised to get such an error, because the original XML file is not long at all.

Code: http://d.pr/RSzp File: http://d.pr/PjrE

You could try setting the -Xms and -Xmx values higher in your eclipse.ini file (I'm assuming you're using Eclipse).

For example:

-vmargs
-Xms128m
-Xmx256m

Here -Xms is the initial heap size and -Xmx is the maximum heap size.
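One caveat worth checking: eclipse.ini sizes the JVM that runs Eclipse itself, while a program launched from Eclipse normally takes its VM arguments from its run configuration. A minimal sketch for confirming which limit your program actually runs with (the class name `HeapSettings` is made up for this example):

```java
public class HeapSettings {

    /** Returns the maximum heap the running JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run with e.g. "java -Xmx256m HeapSettings" and compare the printed value.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported value can be somewhat lower than the -Xmx setting, depending on the garbage collector in use.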

If this is a one-off thing that you just want to get done, I'd try Jason's advice of increasing the memory available to Java.

You are building a very large list of objects, then looping through that list to build a String, then writing that String to a file. The list and the String are probably the reasons for your high memory usage. You could reorganise your code in a more stream-oriented way. Open your output file at the start, then write the XML for each Centroid as it is parsed. Then you wouldn't need to keep a big list of them, and you wouldn't need to hold a big String representing all the XML.
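A rough sketch of that stream-oriented shape, assuming each item is complete at the moment it is written (the class `StreamingOutput` and the helpers `writeDoc` and `escape` are invented for this illustration; they are not from the original code). Each element goes straight to the Writer, so neither a list nor one large String builds up:

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class StreamingOutput {

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Writes one <doc> element immediately; nothing is accumulated in memory. */
    static void writeDoc(Writer out, String title, String description) throws IOException {
        out.write("<doc>");
        out.write("<title>" + escape(title) + "</title>");
        out.write("<description>" + escape(description) + "</description>");
        out.write("</doc>\n");
    }

    public static void main(String[] args) throws IOException {
        // System.out stands in for the output file; in real use this would be
        // a BufferedWriter wrapping a FileWriter opened before parsing starts.
        Writer out = new OutputStreamWriter(System.out);
        out.write("<collection>\n");
        for (int i = 1; i <= 3; i++) {
            writeDoc(out, "title " + i, "description " + i);
        }
        out.write("</collection>\n");
        out.flush();
    }
}
```

If items with the same event have to be merged before output, they cannot be written the moment they are parsed unless the input is already grouped by event, so this pattern fits best when each parsed item maps to exactly one output element.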

Dump the heap and analyze it. You can configure an automatic heap dump on an out-of-memory error using the -XX:+HeapDumpOnOutOfMemoryError JVM option.

http://www.oracle.com/technetwork/java/javase/index-137495.html

https://www.infoq.com/news/2015/12/OpenJDK-9-removal-of-HPROF-jhat

http://blogs.oracle.com/alanb/entry/heap_dumps_are_back_with
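Because the option has to be passed to the JVM that actually runs your program, it can be worth confirming that it arrived. A small sketch using the standard RuntimeMXBean (the class and method names here are made up for the example):

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class HeapDumpFlagCheck {

    /** Returns true if the given JVM arguments request a heap dump on OutOfMemoryError. */
    static boolean heapDumpOnOomRequested(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+HeapDumpOnOutOfMemoryError");
    }

    public static void main(String[] args) {
        // The arguments passed to this JVM, not including the arguments to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Heap dump on OOM requested: " + heapDumpOnOomRequested(jvmArgs));
    }
}
```

On HotSpot the dump is written as an .hprof file (its location can be set with -XX:HeapDumpPath), which can then be opened in a heap analyzer such as Eclipse Memory Analyzer or VisualVM.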

Short answer to explain why you get an OutOfMemoryError: for every centroid found in the file, you loop over the already "registered" centroids to check whether it is already known (to add a new one or to update the already registered one). But for every failed comparison you add a new copy of the new centroid. So every new centroid is added as many times as there are centroids already in the list; then you encounter the first copy you added, you update it, and you leave the loop...
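To make the growth concrete, here is a hypothetical reconstruction of that loop, written only from the description above (the asker's actual code is behind the link and is not reproduced here). Plain strings stand in for centroids, and all names are invented for the demo:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateGrowthDemo {

    /** Assumed shape of the bug: a copy is appended on every failed comparison. */
    static void buggyRegister(List<String> registered, String event) {
        if (registered.isEmpty()) {
            registered.add(event);
            return;
        }
        for (int i = 0; i < registered.size(); i++) {
            if (registered.get(i).equals(event)) {
                return; // known event: "update" it and leave the loop
            } else {
                registered.add(event); // bug: runs once per non-matching entry
            }
        }
    }

    /** Keyed lookup: one entry per distinct event, however many items arrive. */
    static void fixedRegister(Map<String, Integer> registered, String event) {
        registered.merge(event, 1, Integer::sum);
    }

    static int buggySizeAfter(int distinctEvents) {
        List<String> registered = new ArrayList<>();
        for (int i = 0; i < distinctEvents; i++) {
            buggyRegister(registered, "event" + i);
        }
        return registered.size();
    }

    static int fixedSizeAfter(int distinctEvents) {
        Map<String, Integer> registered = new HashMap<>();
        for (int i = 0; i < distinctEvents; i++) {
            fixedRegister(registered, "event" + i);
        }
        return registered.size();
    }

    public static void main(String[] args) {
        System.out.println("buggy list size after 10 distinct events: " + buggySizeAfter(10));
        System.out.println("map size after 10 distinct events: " + fixedSizeAfter(10));
    }
}
```

In this reconstruction the list doubles with every new distinct event, so even a short input file can exhaust the heap, while a keyed lookup keeps one entry per event.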

Here is some refactored code:

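import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.SAXException;

public class CentroidGenerator {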

    final Map<String, Centroid> centroids = new HashMap<String, Centroid>();

    public Collection<Centroid> getCentroids() {
        return centroids.values();
    }

    public void nextItem(FlickrDoc flickrDoc) {

        final String event = flickrDoc.getEvent();
        final Centroid existingCentroid = centroids.get(event);

        if (existingCentroid != null) {
            existingCentroid.update(flickrDoc);
        } else {
            final Centroid newCentroid = new Centroid(flickrDoc);
            centroids.put(event, newCentroid);
        }
    }


    public static void main(String[] args) throws IOException, SAXException {

        // instantiate Digester and disable XML validation
        [...]


        // now that rules and actions are configured, start the parsing process
        CentroidGenerator abp = (CentroidGenerator) digester.parse(new File("PjrE.data.xml"));

        Writer writer = null;

        try {
            File fileOutput = new File("centroids.xml");
            writer = new BufferedWriter(new FileWriter(fileOutput));
            writeOutput(writer, abp.getCentroids());

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (writer != null) {
                    writer.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

    }

    private static void writeOutput(Writer writer, Collection<Centroid> centroids) throws IOException {

        writer.append("<?xml version='1.0' encoding='utf-8'?>" + System.getProperty("line.separator"));
        writer.append("<collection>").append(System.getProperty("line.separator"));

        for (Centroid centroid : centroids) {
            writer.append("<doc>" + System.getProperty("line.separator"));

            writer.append("<title>" + System.getProperty("line.separator"));
            writer.append(centroid.getTitle());
            writer.append("</title>" + System.getProperty("line.separator"));

            writer.append("<description>" + System.getProperty("line.separator"));
            writer.append(centroid.getDescription());
            writer.append("</description>" + System.getProperty("line.separator"));

            writer.append("<time>" + System.getProperty("line.separator"));
            writer.append(centroid.getTime());
            writer.append("</time>" + System.getProperty("line.separator"));

            writer.append("<tags>" + System.getProperty("line.separator"));
            writer.append(centroid.getTags());
            writer.append("</tags>" + System.getProperty("line.separator"));

            writer.append("<geo>" + System.getProperty("line.separator"));

            writer.append("<lat>" + System.getProperty("line.separator"));
            writer.append(centroid.getLat());
            writer.append("</lat>" + System.getProperty("line.separator"));

            writer.append("<lng>" + System.getProperty("line.separator"));
            writer.append(centroid.getLng());
            writer.append("</lng>" + System.getProperty("line.separator"));

            writer.append("</geo>" + System.getProperty("line.separator"));

            writer.append("</doc>" + System.getProperty("line.separator"));

        }
        writer.append("</collection>" + System.getProperty("line.separator") + System.getProperty("line.separator"));

    }

    /**
     * JavaBean class that holds properties of each Document entry. It is important that this class be public and
     * static, in order for Digester to be able to instantiate it.
     */
    public static class FlickrDoc {
        private String id;
        private String title;
        private String description;
        private String time;
        private String tags;
        private String latitude;
        private String longitude;
        private String event;

        public void setId(String newId) {
            id = newId;
        }

        public String getId() {
            return id;
        }

        public void setTitle(String newTitle) {
            title = newTitle;
        }

        public String getTitle() {
            return title;
        }

        public void setDescription(String newDescription) {
            description = newDescription;
        }

        public String getDescription() {
            return description;
        }

        public void setTime(String newTime) {
            time = newTime;
        }

        public String getTime() {
            return time;
        }

        public void setTags(String newTags) {
            tags = newTags;
        }

        public String getTags() {
            return tags;
        }

        public void setLatitude(String newLatitude) {
            latitude = newLatitude;
        }

        public String getLatitude() {
            return latitude;
        }

        public void setLongitude(String newLongitude) {
            longitude = newLongitude;
        }

        public String getLongitude() {
            return longitude;
        }

        public void setEvent(String newEvent) {
            event = newEvent;
        }

        public String getEvent() {
            return event;
        }
    }

    public static class Centroid {
        private final String event;
        private String title;
        private String description;

        private String tags;

        private Integer time;
        private int nbTimeValues = 0; // needed to calculate the average later

        private Float latitude;
        private int nbLatitudeValues = 0; // needed to calculate the average later
        private Float longitude;
        private int nbLongitudeValues = 0; // needed to calculate the average later

        public Centroid(FlickrDoc flickrDoc) {
            event = flickrDoc.event;
            title = flickrDoc.title;
            description = flickrDoc.description;
            tags = flickrDoc.tags;
            if (flickrDoc.time != null) {
                time = Integer.valueOf(flickrDoc.time.trim());
                nbTimeValues = 1; // time is the sum of one value
            }            
            if (flickrDoc.latitude != null) {
                latitude = Float.valueOf(flickrDoc.latitude.trim());
                nbLatitudeValues = 1; // latitude is the sum of one value
            }
            if (flickrDoc.longitude != null) {
                longitude = Float.valueOf(flickrDoc.longitude.trim());
                nbLongitudeValues = 1; // longitude is the sum of one value
            }
        }

        public void update(FlickrDoc newData) {
            title = title + " " + newData.title;
            description = description + " " + newData.description;
            tags = tags + " " + newData.tags;
            if (newData.time != null) {
                nbTimeValues++;
                if (time == null) {
                    time = 0;
                }
                time += Integer.valueOf(newData.time.trim());
            }
            if (newData.latitude != null) {
                nbLatitudeValues++;
                if (latitude == null) {
                    latitude = 0F;
                }
                latitude += Float.valueOf(newData.latitude.trim());
            }
            if (newData.longitude != null) {
                nbLongitudeValues++;
                if (longitude == null) {
                    longitude = 0F;
                }
                longitude += Float.valueOf(newData.longitude.trim());
            }
        }

        public String getTitle() {
            return title;
        }

        public String getDescription() {
            return description;
        }

        public String getTime() {
            if (nbTimeValues == 0) {
                return null;
            } else {
                return Integer.toString(time / nbTimeValues);
            }
        }

        public String getTags() {
            return tags;
        }

        public String getLat() {
            if (nbLatitudeValues == 0) {
                return null;
            } else {
                return Float.toString(latitude / nbLatitudeValues);
            }
        }

        public String getLng() {
            if (nbLongitudeValues == 0) {
                return null;
            } else {
                return Float.toString(longitude / nbLongitudeValues);
            }
        }

        public String getEvent() {
            return event;
        }
    }
}

Answering the question "How to Debug"

It starts with gathering the information that's missing from your post: information that could help future readers with the same problem.

First, the complete stack trace. An out-of-memory exception that's thrown from within the XML parser is very different from one thrown from your code.

Second, the size of the XML file, because "not long at all" is completely useless. Is it 1K, 1M, or 1G? How many elements?

Third, how are you parsing? SAX, DOM, StAX, something completely different?

Fourth, how are you using the data? Are you processing one file or multiple files? Are you accidentally holding onto data after parsing? A code sample would help here (and a link to some 3rd-party site isn't terribly useful for future SO users).
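On the second point, one cheap way to get a real number for "how many elements" is a streaming pass that counts them without building any object model. A minimal sketch using StAX from the JDK (the class name `ElementCounter` is made up; the element name `doc` is assumed from the output format shown elsewhere on this page):

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ElementCounter {

    /** Counts start tags with the given local name, reading the input as a stream. */
    static int countElements(Reader xml, String name) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        // A tiny inline sample; in real use, pass a FileReader for the input file.
        String sample = "<collection><doc><title>a</title></doc><doc><title>b</title></doc></collection>";
        System.out.println("doc elements: " + countElements(new StringReader(sample), "doc"));
    }
}
```

Putting the element count and the file size in the question gives answerers something concrete to reason about.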

OK, I'll admit I'm avoiding your direct question with a possible alternative. You might want to consider parsing with XStream instead, and let it deal with the bulk of the work with less code. My rough example below parses your XML with a 64MB heap. Note that it also requires Apache Commons IO, just to read the input easily so that the hack can turn the <collection> into a <list>.

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.FileUtils;

import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.annotations.XStreamAlias;

public class CentroidGenerator {
    public static void main(String[] args) throws IOException {
        for (Centroid centroid : getCentroids(new File("PjrE.data.xml"))) {
            System.out.println(centroid.title + " - " + centroid.description);
        }
    }

    @SuppressWarnings("unchecked")
    public static List<Centroid> getCentroids(File file) throws IOException {
        String input = FileUtils.readFileToString(file, "UTF-8");
        input = input.replaceAll("collection>", "list>");

        XStream xstream = new XStream();
        xstream.processAnnotations(Centroid.class);

        Object output = xstream.fromXML(input);
        return (List<Centroid>) output;
    }

    @XStreamAlias("doc")
    @SuppressWarnings("unused")
    public static class Centroid {
        private String id;
        private String title;
        private String description;
        private String time;
        private String tags;
        private String latitude;
        private String longitude;
        private String event;
        private String geo;
    }
}

I downloaded your code, something that I almost never do. And I can say with 99% certainty that the bug is in your code: an incorrect "if" inside a loop. It has nothing whatsoever to do with Digester or XML. Either you've made a logic error or you didn't fully think through just how many objects you'd create.

But guess what: I'm not going to tell you what your bug is.

If you can't figure it out from the few hints that I've given above, too bad. It's the same situation that you put all of the other respondents through by not providing enough information -- in the original post -- to actually start debugging.

Perhaps you should read -- actually read -- my former post, and update your question with the information it requests. Or, if you can't be bothered to do that, accept your F.

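Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.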

 