SQL Server 2008 R2 - Shredding XML with Unknown Schema

OK, so I am pretty sure I am going to have to use an edge table produced by OPENXML. I just wanted to check that there isn't a better way.

This is XML I am pulling from an HTTP API directly into SQL Server using OA (OLE Automation) and MSXML. I have written the import stored procedures and have the XML stored in a table column of the XML datatype. This is survey response data, and since every survey is different and can change over time, the elements/columns of a response are unknown. The provider does supply metadata for each survey, which gets me about 70% of the way towards a schema, but there are element names under the responses that don't exist in that metadata. I attribute this to them adding functionality to the survey builder, with more object types, and not accounting for it in their API.

So basically

<xml>
  <response>
    <ResponseID>1</ResponseID>
    <Question1>Yes</Question1>
    <Question1_tag1>99</Question1_tag1>
  </response>
</xml>
  • Responses in reality contain a lot more elements; if a survey questionnaire had 100 questions there would be at least 100 elements on

So I can get ResponseID and Question1 from their metadata, but I also need to shred Question1_tag1 into a column for any given survey. They provide no XSD, and Question1_tag1 exists nowhere else in their metadata, yet it is absolutely data that I need to capture. This happens differently in every result set from a different survey: I need each such element's name as a column, and I need to identify the right datatype for it.

Just a note: I went into the business logic here because, from everything I have read while researching this, the problem appears to be pretty rare. Usually when you run into it, it's a matter of getting the requirements and getting well-formatted data. I just wanted to explain that in this case I really can't.

So again, I am pretty sure I have to write custom T-SQL to shred the XML using an edge table. I was just curious whether anyone could think of a better way.
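For reference, the edge-table approach could look roughly like the sketch below. It assumes the sample document from above (with matching close tags); the temp table name #edge and the output column aliases are made up for illustration. It only produces name/value rows, so turning those names into columns and picking datatypes would still need dynamic SQL on top.

```sql
-- Sample document modelled on the question's XML (hypothetical data).
DECLARE @doc xml = N'<xml>
  <response>
    <ResponseID>1</ResponseID>
    <Question1>Yes</Question1>
    <Question1_tag1>99</Question1_tag1>
  </response>
</xml>';

DECLARE @h int;
EXEC sp_xml_preparedocument @h OUTPUT, @doc;

-- OPENXML without a WITH clause returns the edge table:
-- one row per node, linked to its parent through parentid.
SELECT id, parentid, nodetype, localname, [text]
INTO #edge
FROM OPENXML(@h, N'/xml/response');

-- Release the parsed document as soon as the rows are copied out.
EXEC sp_xml_removedocument @h;

-- Element rows (nodetype 1) carry the element name; the value sits in
-- a child text row (nodetype 3). The result is one row per element.
SELECT r.id                            AS ResponseNodeID,
       e.localname                     AS ElementName,
       CAST(t.[text] AS nvarchar(max)) AS ElementValue
FROM #edge AS r
JOIN #edge AS e ON e.parentid = r.id AND e.nodetype = 1
JOIN #edge AS t ON t.parentid = e.id AND t.nodetype = 3
WHERE r.localname = N'response';

DROP TABLE #edge;
```

Because element names come back as data rather than as column names, elements missing from the metadata (such as Question1_tag1) show up without having to be known in advance.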

I don't think that is possible in SQL without heavy string manipulation to gather the schema, followed by dynamic SQL to extract the data. If you can change the XML schema to something generic like one of the options below, then parsing it in SQL will be a piece of cake:
Option 1:

<xml>
    <response ResponseID="1">
        <Question QuestionID="1" QuestionTagID="1" ResponseValue="Yes" QuestionTagValue="99" />
        <Question QuestionID="2" QuestionTagID="3" ResponseValue="Perhaps" QuestionTagValue="91" />
    </response>
</xml>
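As a rough sketch of how Option 1 could be parsed, assuming the XML is held in a variable named @x (a made-up name) and that the IDs are integers and the values are text (both assumptions, since the real datatypes are not known):

```sql
DECLARE @x xml = N'<xml>
    <response ResponseID="1">
        <Question QuestionID="1" QuestionTagID="1" ResponseValue="Yes" QuestionTagValue="99" />
        <Question QuestionID="2" QuestionTagID="3" ResponseValue="Perhaps" QuestionTagValue="91" />
    </response>
</xml>';

-- One output row per Question element; attribute names are fixed,
-- so no dynamic SQL is needed regardless of the survey.
SELECT r.n.value('@ResponseID',       'int')            AS ResponseID,
       q.n.value('@QuestionID',       'int')            AS QuestionID,
       q.n.value('@QuestionTagID',    'int')            AS QuestionTagID,
       q.n.value('@ResponseValue',    'nvarchar(4000)') AS ResponseValue,
       q.n.value('@QuestionTagValue', 'nvarchar(4000)') AS QuestionTagValue
FROM @x.nodes('/xml/response') AS r(n)
CROSS APPLY r.n.nodes('Question') AS q(n);
```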

Option 2:

<xml>
    <Response ID="1">
        <Question ID="1">
            <ResponseValue>Yes</ResponseValue>
            <QuestionTagID>1</QuestionTagID>
            <QuestionTagValue>1</QuestionTagValue>
        </Question>
        <Question ID="2">
            <ResponseValue>Perhaps</ResponseValue>
            <QuestionTagID>3</QuestionTagID>
            <QuestionTagValue>1</QuestionTagValue>
        </Question>
    </Response>
</xml>
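Option 2 could be parsed along the same lines. This is only a sketch: the variable name @x and the chosen datatypes are assumptions. Note that XPath is case-sensitive, so the path has to use Response with a capital R to match this layout.

```sql
DECLARE @x xml = N'<xml>
    <Response ID="1">
        <Question ID="1">
            <ResponseValue>Yes</ResponseValue>
            <QuestionTagID>1</QuestionTagID>
            <QuestionTagValue>1</QuestionTagValue>
        </Question>
        <Question ID="2">
            <ResponseValue>Perhaps</ResponseValue>
            <QuestionTagID>3</QuestionTagID>
            <QuestionTagValue>1</QuestionTagValue>
        </Question>
    </Response>
</xml>';

-- One output row per Question element. The (…/text())[1] form
-- guarantees a single value, which value() requires.
SELECT r.n.value('@ID', 'int')                                          AS ResponseID,
       q.n.value('@ID', 'int')                                          AS QuestionID,
       q.n.value('(ResponseValue/text())[1]',    'nvarchar(4000)')      AS ResponseValue,
       q.n.value('(QuestionTagID/text())[1]',    'int')                 AS QuestionTagID,
       q.n.value('(QuestionTagValue/text())[1]', 'nvarchar(4000)')      AS QuestionTagValue
FROM @x.nodes('/xml/Response') AS r(n)
CROSS APPLY r.n.nodes('Question') AS q(n);
```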
