Large Xml files are being truncated by MSXML4 / FreeThreadedDOMDocument40 (COM string Interop issue)

I'm using the following code to load a large Xml document (~5 MB):

#include <tchar.h>
#include <atlbase.h>   // CComPtr, CComVariant, CComBSTR
#include <msxml6.h>    // IXMLDOMDocument, FreeThreadedDOMDocument60

int _tmain(int argc, _TCHAR* argv[])
{
    ::CoInitialize(NULL);

    HRESULT hr;
    CComPtr< IXMLDOMDocument > spXmlDocument;
    hr = spXmlDocument.CoCreateInstance(__uuidof(FreeThreadedDOMDocument60));
    if(FAILED(hr)) return 1; // non-zero exit code signals failure

    spXmlDocument->put_preserveWhiteSpace(VARIANT_TRUE);
    spXmlDocument->put_async(VARIANT_FALSE);          // load synchronously
    spXmlDocument->put_validateOnParse(VARIANT_FALSE);

    VARIANT_BOOL bLoadSucceeded = VARIANT_FALSE;
    hr = spXmlDocument->load( CComVariant( L"C:\\XMLFile1.xml" ), &bLoadSucceeded );

    if(FAILED(hr) || bLoadSucceeded==VARIANT_FALSE) return 1;

    // This is the value that appears truncated in the debugger.
    CComVariant bstrDoc;
    hr = spXmlDocument->get_nodeValue(&bstrDoc);

    CComPtr< IXMLDOMNode > spNode;
    hr = spXmlDocument->selectSingleNode(CComBSTR(L"//SpecialNode"), &spNode );

    return 0;
}

I'm finding that the contents of bstrDoc are truncated (no exceptions are thrown and no HRESULT indicates failure).

Does anyone know why? You can reproduce this yourself by creating a large Xml file containing nothing but <xml></xml> elements (~5 MB should do it).

UPDATE: Switching to MSXML 6 made no difference. Setting Async to false and using get_nodeValue / get_text made no difference either (sample updated).

I noticed that if I did selectSingleNode for a node placed at the end of the document it worked fine - it appears that the document loads successfully, and the issue is instead with getting the text for a single node. I'm perplexed however as I'm yet to find anyone else on the internet having this issue.

UPDATE 2: The problem appears to be related to COM interop itself - I've created a simple C# class that does the same thing and exposed it as a COM object. I can see that although the Xml is fine in my C# app, by the time I look at it in my debugger in the C++ app it looks exactly as it did when using MSXML.

It appears I was a victim of my own foolishness - the Xml / strings were in fact not being truncated, the viewer in Visual Studio was simply lying to me.

Outputting the strings to a file showed that the strings were all as they should be.
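
For anyone who wants to run the same check, here is a minimal sketch of one way to do it in standard C++. The helper names DumpWideString and FileSizeInBytes are hypothetical and are not taken from my original code. In a COM program I would assume the string is first copied out of the BSTR, for example with std::wstring(bstr, ::SysStringLen(bstr)), so that the length comes from the BSTR's own length prefix rather than from whatever the debugger shows.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes the raw wchar_t code units of `text` to `path`
// and returns the number of characters written (0 on failure).
std::size_t DumpWideString(const std::wstring& text, const std::string& path)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return 0;

    out.write(reinterpret_cast<const char*>(text.data()),
              static_cast<std::streamsize>(text.size() * sizeof(wchar_t)));
    out.flush();

    return out ? text.size() : 0;
}

// Hypothetical helper: returns the size of the file at `path` in bytes
// (0 if the file cannot be opened).
std::size_t FileSizeInBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return 0;

    const std::streamoff end = in.tellg();
    return end < 0 ? 0 : static_cast<std::size_t>(end);
}
```

If the character count returned by DumpWideString matches the expected document length, and FileSizeInBytes reports that count multiplied by sizeof(wchar_t), the string is complete and only the debugger's display is cut short.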
