
C# XSLT Transformation Out of Memory

All,

I have the code below for transforming an XML document with an XSLT stylesheet. The problem is that when the XML document is around 12 MB, the C# process runs out of memory. Is there a different way of doing the transform that does not consume that much memory?

public string Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)
{
    try
    {
        var stm = new MemoryStream();
        myXslTrans.Transform(myXPathDoc, null, stm);
        var sr = new StreamReader(stm);
        return sr.ReadToEnd();
    }
    catch (Exception e)
    {
        //Log the Exception
    }
}

Here is the stack trace:

at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)
at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)
at System.Text.StringBuilder.Append(Char[] value, Int32 startIndex, Int32 charCount)
at System.IO.StreamReader.ReadToEnd()
at Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)

The first thing I would do is isolate the problem. Take the whole MemoryStream business out of play and stream the output to a file, e.g.:

using (XmlReader xr = XmlReader.Create(new StreamReader("input.xml")))
using (XmlWriter xw = XmlWriter.Create(new StreamWriter("output.xml")))
{
   xslt.Transform(xr, xw);
}

If you still get an out-of-memory exception (I'd bet folding money that you will), that's a pretty fair indication that the problem is not the size of the output but something in the transform itself, e.g. a template that recurses infinitely, like:

<xsl:template match="foo">
   <bar>
      <xsl:apply-templates select="."/>
   </bar>
</xsl:template>
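
A slightly fuller sketch of the file-to-file test suggested above, in case it helps to have something that can be dropped into a console project. The class name, method name, and the three path parameters are hypothetical; the only assumption is that the stylesheet and input live on disk. Note that the input is still loaded into an XPathDocument in memory; only the output is streamed.

```csharp
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class FileTransform
{
    // Hypothetical helper: all three paths are supplied by the caller.
    public static void TransformToFile(string stylesheetPath, string inputPath, string outputPath)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(stylesheetPath);

        // The input tree is still built in memory; only the output is streamed.
        var input = new XPathDocument(inputPath);

        // Passing OutputSettings makes the writer follow the stylesheet's xsl:output element.
        using (XmlWriter writer = XmlWriter.Create(outputPath, xslt.OutputSettings))
        {
            xslt.Transform(input, writer);
        }
    }
}
```

If this version also runs out of memory, the result string is not the culprit, which is exactly the isolation step described above.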

The MemoryStream + ReadToEnd combination means you need 2 copies of the output in memory at that point. You could reduce that to 1 copy by using a StringWriter as the target (replacing the MemoryStream and StreamReader) and calling writer.ToString() when you're done.

But halving the memory use would only raise the limit to documents of about 24 MB at best, which is still far too small a ceiling. Something else must be going on. It is impossible to say what from here; maybe your XSLT is too complicated or inefficient.

var writer = new StringWriter();
//var stm = new MemoryStream();
myXslTrans.Transform(myXPathDoc, null, writer);
//var sr = new StreamReader(stm);
//return sr.ReadToEnd();
return writer.ToString();

You need

stm.Position = 0;

to reset the memory stream to the beginning before reading the contents with the StreamReader. Otherwise you are trying to read content from past the end of the stream.
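
A minimal sketch of why the rewind matters, independent of XSLT. The class and method names are made up for illustration; the StreamReader is created with leaveOpen so the same MemoryStream can be read twice.

```csharp
using System.IO;
using System.Text;

public static class RewindDemo
{
    // Hypothetical helper: reads everything from the stream's current position,
    // optionally rewinding to the start first.
    public static string ReadAll(MemoryStream stm, bool rewind)
    {
        if (rewind)
        {
            stm.Position = 0;
        }

        // leaveOpen: true keeps the MemoryStream usable after the reader is disposed.
        using (var sr = new StreamReader(stm, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            return sr.ReadToEnd();
        }
    }
}
```

After writing the UTF-8 bytes of "<root/>" into a fresh MemoryStream, ReadAll(stm, rewind: false) returns an empty string because the position is already at the end, while ReadAll(stm, rewind: true) returns "<root/>".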

The ReadToEnd() method loads the entire stream into memory. You are better off using an XmlReader to stream the document in chunks and running the XSLT against smaller fragments. You may also want to consider parsing the document with XmlReader entirely and not using XSLT, which is less suited to streaming data and scales less well to large files.
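
One way the chunked approach could look, as a sketch only. It assumes the input is a long run of repeating "record" elements (the recordName parameter is hypothetical) and that the stylesheet can process each record in isolation: inside the transform, each record is the root of its own small document, so templates cannot see siblings or ancestors. It also assumes the stylesheet uses text output or omit-xml-declaration="yes", since each fragment is serialized separately into the same TextWriter.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ChunkedTransform
{
    // Hypothetical helper: transforms each element named recordName on its own
    // and appends the result to output. Returns the number of records processed.
    public static int TransformRecords(XmlReader reader, string recordName,
                                       XslCompiledTransform xslt, TextWriter output)
    {
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == recordName)
            {
                // ReadSubtree exposes just this element; once it is disposed, the outer
                // reader sits on the element's end tag, so the next Read() moves on.
                using (XmlReader record = reader.ReadSubtree())
                {
                    xslt.Transform(record, null, output);
                }
                count++;
            }
        }
        output.Flush();
        return count;
    }
}
```

Only one record's tree is held in memory at a time, which is the point of the suggestion; the trade-off is that any stylesheet logic spanning multiple records has to move into the C# loop.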

It may or may not be related, but you need to make sure you dispose of your stream and reader objects. I have also added the stm.Position = 0 that Nick Jones pointed out.

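public string Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)
{
    try
    {
        using (var stm = new MemoryStream())
        {
            myXslTrans.Transform(myXPathDoc, null, stm);
            // Rewind so the reader starts at the beginning of the output.
            stm.Position = 0;
            using (var sr = new StreamReader(stm))
            {
                return sr.ReadToEnd();
            }
        }
    }
    catch (Exception e)
    {
        //Log the Exception
        // Rethrow so every code path returns or throws; without this the method does not compile.
        throw;
    }
}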
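
If the caller ends up writing the result somewhere anyway (a file, an HTTP response), a variation worth considering is to skip the string altogether and let the transform write straight to a caller-supplied TextWriter. This is a sketch under that assumption, not a drop-in replacement: the class name and the changed signature are hypothetical.

```csharp
using System.IO;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class StreamingTransform
{
    // Hypothetical variant of Transform: the caller owns and disposes the writer,
    // so no full copy of the output is ever held as a string.
    public static void Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans, TextWriter output)
    {
        myXslTrans.Transform(myXPathDoc, null, output);
        output.Flush();
    }
}
```

A caller could pass a StreamWriter wrapped in a using block to send the result to disk, or a StringWriter when a string really is needed.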

Make sure your XSLT does not have any embedded JavaScript; otherwise there is a known memory leak.

My response has validity and can avoid many errors and memory leaks. A user voted me down because he did not understand that JavaScript can be embedded in XSLT as an extension.

Here is an old article that explains how to do it. http://msdn.microsoft.com/en-us/magazine/cc302079.aspx

.NET classes hosted on a web server have known memory leaks when the XslTransform class is used with JavaScript embedded in the XSLT document via an extension. JavaScript was used to get things like dates and to do more dynamic processing. That is why I am warning those who use the JavaScript extension. This is the most likely reason for a memory leak.

Another warning is to be careful when using the newer XslCompiledTransform class. With my large XSLT documents, I profiled it at 4 times the processor time of the XslTransform class and twice its memory.

