Validate XML file against embedded XSD resource with umlauts in C++ / MSXML

I would like to validate an XML file in C++ using the MSXML6 parser and have followed the instructions on http://msdn.microsoft.com/en-us/library/ms762774%28v=vs.85%29.aspx . However, the project I'm working on requires the XSD schema to be embedded in the binary file.

This is the XML file, which should be validated (all files simplified for demonstration purposes):

<?xml version="1.0" encoding="UTF-8"?>
<Document xsi:schemaLocation="urn:test schema.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:test">
  <Party>
    <Id>1</Id>
    <Name>Bob</Name>
    <Salary>100.00</Salary>
  </Party>
  <Party>
    <Id>2</Id>
    <Name>Alice</Name>
    <Salary>200.00</Salary>
  </Party>
  <Party>
    <Id>3</Id>
    <Name>Günther</Name>
    <Salary>300.00</Salary>
  </Party>
</Document>

And here's the XSD schema:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xs:schema xmlns="urn:test" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:test" elementFormDefault="qualified">

  <xs:simpleType name="NameType">
    <xs:restriction base="xs:string">
      <xs:pattern value="([A-Za-z0-9ÄÖÜäöü]){1,10}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="PartyType">
    <xs:sequence>
      <xs:element name="Id" type="xs:integer"/>
      <xs:element name="Name" type="NameType"/>
      <xs:element name="Salary" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="Document">
    <xs:complexType>
      <xs:choice minOccurs="1" maxOccurs="9999999">
        <xs:element name="Party" type="PartyType"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>

The above XSD schema is embedded as a Win32 resource in the executable and can be referenced via the identifier "IDR_XSDSCHEMA1" (see comment line with OPTION 1):

#include <stdio.h>
#include <tchar.h>
#include <windows.h>
#import <msxml6.dll>
#include "resource.h"

// Macro that calls a COM method returning HRESULT value.
#define CHK_HR(stmt) do { hr=(stmt); if (FAILED(hr)) return bstrResult; } while(0)

//Method for acquiring own handle
HMODULE GetThisDllHandle()
{
    MEMORY_BASIC_INFORMATION info;
    size_t len = VirtualQueryEx(GetCurrentProcess(), (void*)GetThisDllHandle, &info, sizeof(info));
    return len ? (HMODULE)info.AllocationBase : NULL;
}

_bstr_t validateFile(_bstr_t bstrFile)
{
    //Schema collection
    MSXML2::IXMLDOMSchemaCollectionPtr pXS;

    //XML document
    MSXML2::IXMLDOMDocument2Ptr pXD;

    //XSD document
    MSXML2::IXMLDOMDocument2Ptr pXSD;

    //Validation object
    MSXML2::IXMLDOMParseErrorPtr pErr;

    _bstr_t bstrResult = L"";
    HRESULT hr = S_OK;

    //Load XSD schema from resource
    HMODULE handle = GetThisDllHandle();
    HRSRC rc = ::FindResource(handle, MAKEINTRESOURCE(IDR_XSDSCHEMA1), L"XSDSCHEMA");
    HGLOBAL rcData = ::LoadResource(handle, rc);
    LPVOID data = (::LockResource(rcData));
    ::FreeResource(rcData);

    //Load schema stream into document
    CHK_HR(pXSD.CreateInstance(__uuidof(MSXML2::DOMDocument60), NULL, CLSCTX_INPROC_SERVER));

    if (pXSD->loadXML((LPCSTR)data) != VARIANT_TRUE)
        return bstrResult;

    // Create a schema cache
    CHK_HR(pXS.CreateInstance(__uuidof(MSXML2::XMLSchemaCache60), NULL, CLSCTX_INPROC_SERVER));

    //--> OPTION 1: VALIDATING AGAINST EMBEDDED XSD RESOURCE; DOESN'T WORK <--
    CHK_HR(pXS->add(L"urn:test", pXSD.GetInterfacePtr()));

    //--> OPTION 2: VALIDATING AGAINST PHYSICAL XSD FILE; WORKS FINE <--
    //CHK_HR(pXS->add(L"urn:test", L"schema.xsd"));

    // Create a DOMDocument and set its properties.
    CHK_HR(pXD.CreateInstance(__uuidof(MSXML2::DOMDocument60), NULL, CLSCTX_INPROC_SERVER));

    pXD->async = VARIANT_FALSE;
    pXD->validateOnParse = VARIANT_TRUE;
    pXD->preserveWhiteSpace = VARIANT_TRUE;

    //Assign the schema cache to the Document's schema collection
    pXD->schemas = pXS.GetInterfacePtr();

    //Load XML file
    if(pXD->load(bstrFile) != VARIANT_TRUE)
    {
        pErr = pXD->parseError;

        bstrResult = _bstr_t(L"Validation failed on ") + bstrFile +
        _bstr_t(L"\n=====================") +
        _bstr_t(L"\nReason: ") + _bstr_t(pErr->Getreason()) +       
        _bstr_t(L"\nSource: ") + _bstr_t(pErr->GetsrcText()) +
        _bstr_t(L"\nLine: ") + _bstr_t(pErr->Getline());
    }

    else
    {
        bstrResult = _bstr_t(L"Validation succeeded for ") + bstrFile +
        _bstr_t(L"\n======================\n") +
        _bstr_t(pXD->xml);
    }

    return bstrResult;    
}

int _tmain(int argc, _TCHAR* argv[])
{
    HRESULT hr = CoInitialize(NULL);
    if(SUCCEEDED(hr))
    {
        try
        {
            _bstr_t bstrOutput = validateFile(L"Document.xml");
            MessageBox(NULL, bstrOutput, L"schemaCache",MB_OK);
        }

        catch(_com_error &e)
        {
              MessageBox(NULL, e.Description(), L"schemaCache",MB_OK);
        }
        CoUninitialize();
    }
    return 0;
}

Unfortunately, I have encountered some strange behavior when running the validation routine (OPTION 1 comment). It seems that the umlauts in the XSD resource aren't decoded properly while the resource is being loaded into the stream. This corrupts the schema used for validation, as seen in the following result:

[Screenshot: validation failed]

However, when the schema file is loaded directly from disk (OPTION 2 comment), the validation routine runs just fine:

[Screenshot: validation succeeded]

I have already tried converting the loaded stream from Unicode to multi-byte and vice versa, but to no avail. Is there something I'm missing here? Or are Win32 resources limited to a specific character set? Thanks for any suggestions.

See WhozCraig's comment: calling MultiByteToWideChar() with CP_UTF8 as the code page argument returns a valid Unicode string.
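The conversion described here could look roughly like the sketch below. It is not the code from the original post: the helper name Utf8ToWide is hypothetical, and skipping a leading UTF-8 byte order mark is an assumption about how the .xsd file might have been saved. On Windows it calls MultiByteToWideChar() with CP_UTF8; the non-Windows branch exists only so the sketch can be compiled and tried elsewhere.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <codecvt>
#include <locale>
#endif

// Hypothetical helper (not part of the original post): decodes `size` bytes
// of UTF-8 text into a wide string. The length is passed explicitly because
// raw resource data is not guaranteed to be null-terminated.
std::wstring Utf8ToWide(const char* bytes, int size)
{
    assert(bytes != nullptr && size >= 0);

    // Assumption: the embedded file may start with a UTF-8 BOM (EF BB BF),
    // which is skipped here so it does not end up in front of "<?xml".
    if (size >= 3 &&
        static_cast<unsigned char>(bytes[0]) == 0xEF &&
        static_cast<unsigned char>(bytes[1]) == 0xBB &&
        static_cast<unsigned char>(bytes[2]) == 0xBF)
    {
        bytes += 3;
        size -= 3;
    }
    if (size == 0)
        return std::wstring();

#ifdef _WIN32
    // First call: ask how many UTF-16 code units the text needs.
    int wideLen = ::MultiByteToWideChar(CP_UTF8, 0, bytes, size, NULL, 0);
    if (wideLen <= 0)
        return std::wstring();

    // Second call: perform the actual conversion into the buffer.
    std::wstring wide(static_cast<size_t>(wideLen), L'\0');
    ::MultiByteToWideChar(CP_UTF8, 0, bytes, size, &wide[0], wideLen);
    return wide;
#else
    // Portable stand-in for trying the sketch outside Windows.
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    return conv.from_bytes(bytes, bytes + size);
#endif
}
```

In validateFile, the byte count would presumably come from ::SizeofResource(handle, rc), and the converted text could then be handed to the parser as pXSD->loadXML(_bstr_t(wide.c_str())) instead of casting the raw resource pointer to LPCSTR.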
