
Inserting XML documents into SQL Server 2008 database

I need help inserting xml files into SQL Server 2008.

I have the following SQL statement:

insert into dbo.articles(id, title, contents)
  SELECT  X.article.query('id').value('.', 'INT'),
        X.article.query('article').value('.', 'VARCHAR(50)'),
        X.article.query('/doc/text()').value('.', 'VARCHAR(MAX)')
   FROM (
     SELECT CAST(x AS XML)
     FROM OPENROWSET(
           BULK 'E:\test\test_files\1000006.xml',
           SINGLE_BLOB) AS T(x)
        ) AS T(x)
CROSS APPLY x.nodes('doc') AS X(article);

which basically shreds an XML doc into columns. However, I want to be able to insert all the files in a folder rather than manually specifying each file, as with E:\test\test_files\1000006.xml in this case.

OK, first crack at answering a question on Stack Overflow...

You have two issues: first, getting the filenames from the folder into a SQL table or table variable, and second, reading the XML from each file.

The first is easy, if you don't mind using xp_cmdshell:

DECLARE @Folder         VARCHAR(255)    = 'C:\temp\*.xml'
DECLARE @Command        VARCHAR(255)
DECLARE @FilesInAFolder TABLE  (XMLFileName VARCHAR(500))

--
SET @Command = 'DIR ' + @Folder + ' /TC /b'
--
INSERT INTO @FilesInAFolder
EXEC MASTER..xp_cmdshell @Command
--
SELECT * FROM @FilesInAFolder
WHERE XMLFileName IS NOT NULL
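
One practical note on the xp_cmdshell step: the option is disabled by default on SQL Server 2008, so the DIR call above fails until it is switched on. A minimal sketch of enabling it follows, assuming you have sysadmin-level rights and that your security policy permits it; turning it back off afterwards is optional.

```sql
-- Sketch: enable xp_cmdshell (disabled by default); requires elevated rights.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

-- ... run the directory listing here ...

-- Optionally switch it off again once the load has finished.
EXEC sp_configure 'xp_cmdshell', 0;
RECONFIGURE;
```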

The second part, converting the XML files to SQL rows, is a little trickier, because OPENROWSET(BULK ...) won't take a variable for the file path and the result of an EXEC can't be assigned directly to an XML variable. Here's code that works for ONE file...

DECLARE @x              xml
DECLARE @Results        TABLE  (result xml)
DECLARE @xmlFileName    NVARCHAR(300) = 'C:\temp\YourXMLFile.xml'
DECLARE @TempTable      TABLE 
    (
    ID                  INT,        
    Article             NVARCHAR(50),
    doctext             NVARCHAR(MAX)
    )   

/* ---- have to use dynamic SQL because OPENROWSET(BULK ...) won't take a variable;
        REPLACE doubles any single quotes in the path ------------------------------*/
DECLARE @sql NVARCHAR(4000) =
 'SELECT * FROM OPENROWSET ( BULK ''' + REPLACE(@xmlFileName, '''', '''''') + ''', SINGLE_BLOB ) AS xmlData'

/* ---- have to go through a table variable because the result of EXEC can't be
        assigned directly to an XML variable ---------------------------------------*/
INSERT INTO @Results EXEC(@sql)

SELECT @x = result FROM @Results

/* ---- this is MUCH faster than using a cross-apply ------------------------------*/
INSERT INTO @TempTable(ID,Article,doctext)                                              
SELECT 
                x.value('ID[1]',        'INT'           ),      
                x.value('Article[1]',   'NVARCHAR(50)'  ),                      
                x.value('doctext[1]',   'NVARCHAR(MAX)' )   
FROM @x.nodes(N'/doc')      t(x) 

SELECT * FROM @TempTable

Now the hard bit is putting these two together. I tried several ways to get this code into a function, but you can't use dynamic SQL or EXEC in a function, and you can't call an SP from a function. You also can't split the code across two separate SPs, because INSERT ... EXEC statements can't be nested: if you INSERT ... EXEC an SP that contains the code above, its own INSERT ... EXEC fails. So you either use a cursor to put the two code blocks together, i.e. cursor through @FilesInAFolder and pass each XMLFileName value into the second code block as the variable @xmlFileName, or you use SSIS or CLR.

Sorry I ran out of time to build a complete SP with a directory name as a parameter and a cursor but that is pretty straightforward. Phew!
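
For what it's worth, here is a rough sketch of what that procedure might look like. It is not the answerer's code: the procedure name dbo.usp_LoadXmlFolder and the @Folder parameter are made up for illustration, the target table and XPath expressions are borrowed unchanged from the question, and it assumes xp_cmdshell is enabled and that @Folder ends with a backslash. The INSERT sits inside the dynamic statement, so the only INSERT ... EXEC is the one that captures the DIR output.

```sql
-- Sketch only: procedure and parameter names are hypothetical.
CREATE PROCEDURE dbo.usp_LoadXmlFolder
    @Folder NVARCHAR(255)          -- assumed to end with '\', e.g. N'C:\temp\'
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Files   TABLE (XMLFileName NVARCHAR(500));
    DECLARE @Command NVARCHAR(1000);
    DECLARE @File    NVARCHAR(500);
    DECLARE @sql     NVARCHAR(MAX);

    -- Step 1: list the .xml files; /b returns bare file names without the path.
    SET @Command = N'DIR "' + @Folder + N'*.xml" /b';
    INSERT INTO @Files
    EXEC master..xp_cmdshell @Command;

    -- Step 2: loop over the names and load each file.
    DECLARE FileCursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT @Folder + XMLFileName
        FROM @Files
        WHERE XMLFileName LIKE N'%.xml';   -- also skips NULL and error lines

    OPEN FileCursor;
    FETCH NEXT FROM FileCursor INTO @File;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- OPENROWSET(BULK ...) needs a literal path, hence dynamic SQL.
        -- REPLACE doubles any single quotes embedded in the file name.
        SET @sql = N'
            INSERT INTO dbo.articles (id, title, contents)
            SELECT  X.article.query(''id'').value(''.'', ''INT''),
                    X.article.query(''article'').value(''.'', ''VARCHAR(50)''),
                    X.article.query(''/doc/text()'').value(''.'', ''VARCHAR(MAX)'')
            FROM (
                  SELECT CAST(x AS XML)
                  FROM OPENROWSET(
                        BULK ''' + REPLACE(@File, N'''', N'''''') + N''',
                        SINGLE_BLOB) AS T(x)
                 ) AS T(x)
            CROSS APPLY x.nodes(''doc'') AS X(article);';

        EXEC sp_executesql @sql;

        FETCH NEXT FROM FileCursor INTO @File;
    END

    CLOSE FileCursor;
    DEALLOCATE FileCursor;
END
```

It would be called as EXEC dbo.usp_LoadXmlFolder @Folder = N'C:\temp\'; and a file that fails to parse as XML will abort the loop unless you add your own error handling.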

Are you using a stored procedure? You can specify the file name as a parameter.

Something like...

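CREATE PROCEDURE sp_XMLLoad
   @FileName NVARCHAR(260)
AS
SET NOCOUNT ON
-- OPENROWSET(BULK ...) needs a literal path, so build the statement dynamically
-- (REPLACE doubles any single quotes in the file name)
DECLARE @sql NVARCHAR(MAX) = N'
SELECT  X.article.query(''id'').value(''.'', ''INT''),
        X.article.query(''article'').value(''.'', ''VARCHAR(50)''),
        X.article.query(''/doc/text()'').value(''.'', ''VARCHAR(MAX)'')
FROM (
      SELECT CAST(x AS XML)
      FROM OPENROWSET(
            BULK ''' + REPLACE(@FileName, N'''', N'''''') + N''',
            SINGLE_BLOB) AS T(x)
     ) AS T(x)
CROSS APPLY x.nodes(''doc'') AS X(article);'
EXEC sp_executesql @sql

Not exactly like the query in the question: OPENROWSET(BULK ...) won't accept @FileName directly, so the file name has to be assembled into the statement with quotes around it, and that string is then executed.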
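
As a usage sketch, assuming the procedure is completed so that it returns the id, article and text columns in that order, a single file could be loaded like this. The path is the one from the question, and dbo.articles is the question's target table.

```sql
-- Sketch: capture the procedure's result set into the target table.
INSERT INTO dbo.articles (id, title, contents)
EXEC sp_XMLLoad @FileName = N'E:\test\test_files\1000006.xml';
```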

If you're using SSIS, you can then pump all the files from a directory to the stored procedure, or to the SSIS code used.

I think you can do it with a cursor and xp_cmdshell, although I would never recommend using xp_cmdshell.

DECLARE @FilesInAFolder TABLE  (FileNames VARCHAR(500))
DECLARE @File VARCHAR(500)
INSERT INTO @FilesInAFolder
EXEC MASTER..xp_cmdshell 'dir /b c:\'


DECLARE CU CURSOR FOR 
SELECT 'c:\' + FileNames
FROM @FilesInAFolder
WHERE RIGHT(FileNames,4) = '.xml'

OPEN CU
FETCH NEXT FROM CU INTO @File
WHILE @@FETCH_STATUS = 0
BEGIN
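    -- OPENROWSET(BULK ...) only accepts a literal path, so build the statement per file
    -- (REPLACE doubles any single quotes in the file name)
    DECLARE @sql NVARCHAR(MAX)
    SET @sql = N'
    INSERT INTO dbo.articles(id, title, contents)
    SELECT  X.article.query(''id'').value(''.'', ''INT''),
            X.article.query(''article'').value(''.'', ''VARCHAR(50)''),
            X.article.query(''/doc/text()'').value(''.'', ''VARCHAR(MAX)'')
    FROM (
            SELECT CAST(x AS XML)
            FROM OPENROWSET(
                    BULK ''' + REPLACE(@File, '''', '''''') + N''',
                    SINGLE_BLOB) AS T(x)
         ) AS T(x)
    CROSS APPLY x.nodes(''doc'') AS X(article);'

    EXEC sp_executesql @sql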

    FETCH NEXT FROM CU INTO @File
END

CLOSE CU
DEALLOCATE CU
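
If xp_cmdshell is off the table, one alternative that is sometimes used for the directory listing is the extended procedure xp_dirtree. Treat the following as a sketch under assumptions: xp_dirtree is undocumented and unsupported, so its behaviour is not guaranteed across versions, and the table variable name and the C:\temp folder are placeholders. With the third argument set to 1 it returns files as well as subdirectories, flagged in the last column.

```sql
-- Sketch: list the .xml files in a folder without xp_cmdshell.
-- xp_dirtree is undocumented; arguments are (path, depth, include files).
DECLARE @DirTree TABLE (subdirectory NVARCHAR(512), depth INT, isfile BIT);

INSERT INTO @DirTree (subdirectory, depth, isfile)
EXEC master.sys.xp_dirtree N'C:\temp', 1, 1;

SELECT N'C:\temp\' + subdirectory AS XMLFileName
FROM @DirTree
WHERE isfile = 1
  AND subdirectory LIKE N'%.xml';
```

The resulting file names can feed the same cursor loop in place of the DIR output.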
