
How do I parse a delimited string into columns using only string functions (SQL Server 2008 R2)?

I want to parse a delimited string from a database field into multiple columns. The string can have anywhere from 0 to 7 components, delimited by a special character (char(7) in my particular case). There should never be more than seven components; if there are, the extras should be left, delimiter(s) included, in the last component. I need to do this without using UDFs or T-SQL. I don't think the XML parsing functionality is a good fit for this, but I would consider it if the solution is efficient.

That leaves the string manipulation functions. Since I'm on SQL Server 2008 R2, the string_split() function is not an option. The brute-force approach (below) seems to work, but it's unwieldy and unreadable. I'd be interested in any improvement on it.

create table #x (a int, delimited_value varchar(8000))
insert into #x values 
 (1, 'abc')
,(2, 'defgh' + char(7) + 'ij' + char(7) + 'klmnop')
,(3, '')
,(4, 'qr' + char(7) + 's' + char(7) + 't' + char(7) + 'u' + char(7) + 'v' + char(7) + 'w' + char(7) + 'xyz')
,(5, '012' + char(7) + char(7) + '3' + char(7))
,(6, char(7) + char(7) + '4567' + char(7) + char(7) + '89')

select a
,substring(delimited_value, 1, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) - 1) as component1
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0) + 1, 8000), isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8001) - isnull(nullif(charindex(char(7), delimited_value), 0), 8000) - 1) as component2
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0) + 1, 8000), isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8001) - isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) - 1) as component3
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0) + 1, 8000), isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8001) - isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) - 1) as component4
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0) + 1, 8000), isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8001) + 1 ), 0), 8001) - isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) - 1) as component5
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0) + 1, 8000), isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8001) - isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) - 1) as component6
,substring(delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value, isnull(nullif(charindex(char(7), delimited_value), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0), 8000) + 1 ), 0) + 1, 8000), 8000) as component7
from #x
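One way the brute-force query might be made more readable while still using only string functions is to compute each delimiter position once, in a chain of CROSS APPLY steps, and then reuse those positions in plain substring calls. The following is a minimal sketch, not a tested replacement: the aliases s and p1 through p6 are made up for illustration, and it assumes the #x table above and values of at most 7994 characters, so that the six appended delimiters are not truncated by the varchar(8000) limit.

```sql
-- Sketch only: assumes the #x table from the question and values of at most
-- 7994 characters. The aliases s and p1..p6 are illustrative names.
select x.a
     , substring(s.v, 1, p1.pos - 1)                   as component1
     , substring(s.v, p1.pos + 1, p2.pos - p1.pos - 1) as component2
     , substring(s.v, p2.pos + 1, p3.pos - p2.pos - 1) as component3
     , substring(s.v, p3.pos + 1, p4.pos - p3.pos - 1) as component4
     , substring(s.v, p4.pos + 1, p5.pos - p4.pos - 1) as component5
     , substring(s.v, p5.pos + 1, p6.pos - p5.pos - 1) as component6
     -- Read component7 from the unpadded value so it keeps any extra delimiters.
     , substring(x.delimited_value, p6.pos + 1, 8000)  as component7
from #x x
-- Pad with six delimiters so every charindex below always finds a match.
cross apply (select x.delimited_value + replicate(char(7), 6) as v) s
cross apply (select charindex(char(7), s.v)             as pos) p1
cross apply (select charindex(char(7), s.v, p1.pos + 1) as pos) p2
cross apply (select charindex(char(7), s.v, p2.pos + 1) as pos) p3
cross apply (select charindex(char(7), s.v, p3.pos + 1) as pos) p4
cross apply (select charindex(char(7), s.v, p4.pos + 1) as pos) p5
cross apply (select charindex(char(7), s.v, p5.pos + 1) as pos) p6
```

Because of the padding, missing components should come back as empty strings rather than NULL, which is also how the brute-force query behaves. If longer values are possible, casting delimited_value to varchar(max) before padding would be one way to lift the length assumption.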

Despite the requirement not to use a function, I am posting a solution that uses one. This is an inline table-valued function and is insanely fast. Short of using CLR, you aren't going to find a faster splitter out there. You can find the article and the code here. http://www.sqlservercentral.com/articles/Tally+Table/72993/

If you don't like that one, there are several other excellent choices here. https://sqlperformance.com/2012/07/t-sql-queries/split-strings

Using the Jeff Moden splitter (from the first link above), your code would be this simple.

select x.a
    , component1 = max(case when y.ItemNumber = 1 then y.Item end)
    , component2 = max(case when y.ItemNumber = 2 then y.Item end)
    , component3 = max(case when y.ItemNumber = 3 then y.Item end)
    , component4 = max(case when y.ItemNumber = 4 then y.Item end)
    , component5 = max(case when y.ItemNumber = 5 then y.Item end)
    , component6 = max(case when y.ItemNumber = 6 then y.Item end)
    , component7 = max(case when y.ItemNumber = 7 then y.Item end)
from #x x
cross apply dbo.DelimitedSplit8K(x.delimited_value, char(7)) y
group by x.a

I think the XML approach would be an easy and efficient solution here.

Example

Select A.a
      ,B.*
 From  #x A
 Cross Apply (
                Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
                      ,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
                      ,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
                      ,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
                      ,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
                      ,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
                      ,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
                From  (Select Cast('<x>' + replace(delimited_value,char(7),'</x><x>')+'</x>' as xml) as xDim) as A 
             ) B

Returns

a   Pos1    Pos2    Pos3    Pos4    Pos5    Pos6    Pos7
1   abc     NULL    NULL    NULL    NULL    NULL    NULL
2   defgh   ij      klmnop  NULL    NULL    NULL    NULL
3           NULL    NULL    NULL    NULL    NULL    NULL
4   qr      s       t       u       v       w       xyz
5   012             3               NULL    NULL    NULL
6                   4567            89      NULL    NULL
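One caveat with the XML cast: it fails if a value contains characters that are special in XML, such as &, < or >. A hedged variant of the same query escapes those three characters before building the XML; value() then returns the decoded text. This is only a sketch under the assumption that no other XML-invalid characters occur in the data, and the aliases e and p are made up for illustration.

```sql
-- Sketch only: assumes the #x table from the question. Escaping &, < and >
-- first keeps the cast to xml from failing on values that contain them.
-- The aliases e and p are illustrative names.
select x.a
     , p.*
from #x x
cross apply (
    select cast('<x>'
              + replace(
                    replace(replace(replace(x.delimited_value
                        , '&', '&amp;')   -- escape & first so later entities survive
                        , '<', '&lt;')
                        , '>', '&gt;')
                  , char(7), '</x><x>')   -- then turn each delimiter into a tag boundary
              + '</x>' as xml) as xDim
) e
cross apply (
    select Pos1 = ltrim(rtrim(e.xDim.value('/x[1]', 'varchar(max)')))
         , Pos2 = ltrim(rtrim(e.xDim.value('/x[2]', 'varchar(max)')))
         , Pos3 = ltrim(rtrim(e.xDim.value('/x[3]', 'varchar(max)')))
         , Pos4 = ltrim(rtrim(e.xDim.value('/x[4]', 'varchar(max)')))
         , Pos5 = ltrim(rtrim(e.xDim.value('/x[5]', 'varchar(max)')))
         , Pos6 = ltrim(rtrim(e.xDim.value('/x[6]', 'varchar(max)')))
         , Pos7 = ltrim(rtrim(e.xDim.value('/x[7]', 'varchar(max)')))
) p
```

Like the query above, this variant returns only the seventh element in Pos7, so it does not keep extra components and their delimiters in the last column as the question asks.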
