
Split A Single Field Value Into Multiple Fixed-Length Column Values in T-SQL

I've looked at about 15 different answers on SO but haven't found this exact situation yet. I'm doing a custom data export to a data file that will be imported into an older system, which needs the data in a specific length/format. I have a "MEMO" column that can hold a large amount of text. I need to select that value and split it into multiple columns with a FIXED length of 75 chars. For instance, if I have a row with a message that is 185 chars, I need to split that into 3 new columns of 75 chars (MEMO1, MEMO2, MEMO3), with the remaining space in MEMO3 padded with spaces to reach 75 chars. The other catch: I can only use up to 18 75-char columns to dump the data into. If the text is longer than 1350 (18x75) chars, the rest gets truncated.

I tried this approach, but it doesn't take the total number of new memo columns into consideration. I need some way to iterate over NUMBEROFMEMOS and select only the necessary number of new MEMO columns, but apparently you can't do a WHILE in a SELECT.

SELECT FIRSTNAME, 
       LASTNAME, 
       DOB, 
       CEILING(LEN(NOTETEXT) / 75.0) as NUMBEROFMEMOS,
       SUBSTRING(NOTETEXT, 1, 75) as MEMOLINE1,
       SUBSTRING(NOTETEXT, 76, 75) as MEMOLINE2,   -- third argument is a length, not an end position
       SUBSTRING(NOTETEXT, 151, 75) as MEMOLINE3
       -- etc. etc. etc
FROM CUSTOMER

I'm a long-time application dev who is trying to get more involved in the DB side of things. If I were in the C# world, I would just create a method that runs a for loop up to NUMBEROFMEMOS and outputs the data that way. I don't think that works here though. Thanks in advance!

Since you are a .NET developer, I guess it will be easy for you to write a .NET function that you can use in your T-SQL code. To write SQL CLR functions, check this answer (I used one of its links to implement a SQL CLR regex function).

Let's say you need to split the values into 4-character chunks and show at most 6 of them:

DECLARE @DataSouce TABLE
(
    [RecordID] TINYINT IDENTITY(1,1) PRIMARY KEY
   ,[RecordData] NVARCHAR(MAX)
);

INSERT INTO @DataSouce ([RecordData])
VALUES ('test some test goes here')
      ,('some numbers go here - 1111122222233333344444444445');


SELECT DS.[RecordID]
      ,RM.[MatchID]
      ,RM.[CaptureValue]
FROM @DataSouce DS
CROSS APPLY [dbo].[fn_Utils_RegexMatches] ([RecordData], '.{1,4}') RM;

Now the data is split. Let's pivot it and show only 6 of the chunks:

SELECT *
FROM
(
    SELECT DS.[RecordID]
          ,RM.[MatchID]
          ,RM.[CaptureValue]
    FROM @DataSouce DS
    CROSS APPLY [dbo].[fn_Utils_RegexMatches] ([RecordData], '.{1,4}') RM
) DS
PIVOT
(
    MAX([CaptureValue]) FOR [MatchID] IN ([0], [1], [2], [3], [4], [5], [6])
) PVT;

Here I use a regex function to split the data, and PIVOT to create the columns and exclude some of the chunks. You can now insert the data into a table in order to materialize it, and then export it. You can implement such a function using the link above, or create your own function that does what you need.

You can use dynamic SQL. Here is an example you can use to solve your problem:
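If installing a CLR function is not an option, the same chunk rows can probably be produced with plain SUBSTRING and a small list of chunk numbers. This is a different technique from the regex split above, shown only as a sketch: the names @DataSource, @ChunkLen, ChunkNo and ChunkValue are made up for the illustration, and the chunk numbers here start at 1.

```sql
-- Same sample data as above; the table variable is redeclared so the sketch runs on its own.
DECLARE @DataSource TABLE
(
    [RecordID] TINYINT IDENTITY(1,1) PRIMARY KEY
   ,[RecordData] NVARCHAR(MAX)
);

INSERT INTO @DataSource ([RecordData])
VALUES ('test some test goes here')
      ,('some numbers go here - 1111122222233333344444444445');

DECLARE @ChunkLen INT = 4;   -- hypothetical name; would be 75 for the memo export

-- One row per chunk: chunk N starts at character (N - 1) * @ChunkLen + 1.
SELECT DS.[RecordID]
      ,N.[ChunkNo]
      ,SUBSTRING(DS.[RecordData], (N.[ChunkNo] - 1) * @ChunkLen + 1, @ChunkLen) AS [ChunkValue]
FROM @DataSource DS
CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6)) N ([ChunkNo])   -- at most 6 chunks per record
WHERE (N.[ChunkNo] - 1) * @ChunkLen < LEN(DS.[RecordData])       -- skip chunks past the end of the text
ORDER BY DS.[RecordID], N.[ChunkNo];
```

The result has the same shape as the regex output (record id, chunk number, chunk text), so it can be fed to the same kind of PIVOT. Note that LEN ignores trailing spaces, so text that ends in spaces may lose a final all-space chunk.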

declare @text nvarchar(max) = N'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
declare @len int = 75;

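-- Multiply by 1.0 to force decimal division; integer division would round down before CEILING runs.
declare @NUMBEROFMEMOS int = CEILING(LEN(@text) / (@len * 1.0));

declare @query nvarchar(max) = N'select ';

declare @loop int = 0;
declare @start int = 1;
declare @memoline int = 1;

while @loop < @NUMBEROFMEMOS begin
    if @loop > 0 begin
        set @query += N', ';
    end

   -- Reference @text as a parameter instead of pasting its value into the query,
   -- so apostrophes in the memo cannot break (or inject into) the generated SQL.
   set @query += N'substring(@text, ' + cast(@start as nvarchar(10)) + N', ' + cast(@len as nvarchar(10)) + N') as MEMOLINE' + cast(@memoline as nvarchar(10));

   set @start += @len;
   set @loop += 1;
   set @memoline += 1;
end

-- sp_executesql is the documented way to run parameterized dynamic SQL.
execute sp_executesql @query, N'@text nvarchar(max)', @text = @text;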

Here is a CTE which will normalize your data. From there you can pivot.

Declare @Customer table (FirstName varchar(50),LastName varchar(50),DOB Date,Memo varchar(max))
Insert into @Customer values 
('John','Doe'  ,'1964-07-29','I''ve looked at about 15 different answers on SO but haven''t found this exact situation yet. I''m doing a custom data export and need to export to a data file that will be imported into an older system that needs the data in a specific length/format. I have a "MEMO" column that can have a large amount of text in it. I need to select that value and split it into multiple columns with a FIXED length of 75 chars. For instance, if I have a row with a message that is 185 chars, I need to split that into 3 new columns of 75 chars, MEMO1, MEMO2, MEMO3, with the remaining space in MEMO3 being filled with spaces to equal the 75 chars. The other catch, I can only use up to 18 75-char columns to dump the data into. If it''s longer than 1350 (18x75) chars, the rest gets truncated.'),
('Jane','Smith','1972-03-21','I''m a long-time application dev who is trying to get more involved in the DB side of things. If I were in the C# world, I would just create a method to do a for loop up to NUMBEROFMEMOS and output the data that way. I don''t think that works here though. Thanks in advance!')

Declare @MaxLen int = 75

;with cteBase as (
    Select FirstName,LastName,DOB,Row=1,Memo=substring(Memo,1,@MaxLen) from @Customer
    Union All
    Select h.FirstName,h.LastName,h.DOB,Row=cteBase.Row+1,Memo=substring(h.Memo,((cteBase.Row+0)*@MaxLen)+1,@MaxLen) FROM @Customer h INNER JOIN cteBase ON h.FirstName = cteBase.FirstName and h.LastName = cteBase.LastName where substring(h.Memo,((cteBase.Row+0)*@MaxLen)+1,@MaxLen)<>''
)
--Select * from cteBase Order by LastName,Row
Select FirstName,LastName,DOB
      ,Memo01=max(case when Row=1 then Memo else null end)
      ,Memo02=max(case when Row=2 then Memo else null end)
      ,Memo03=max(case when Row=3 then Memo else null end)
      ,Memo04=max(case when Row=4 then Memo else null end)
      ,Memo05=max(case when Row=5 then Memo else null end)
      ,Memo06=max(case when Row=6 then Memo else null end)
      ,Memo07=max(case when Row=7 then Memo else null end)
      ,Memo08=max(case when Row=8 then Memo else null end)
      ,Memo09=max(case when Row=9 then Memo else null end)
      ,Memo10=max(case when Row=10 then Memo else null end)
      ,Memo11=max(case when Row=11 then Memo else null end)
      ,Memo12=max(case when Row=12 then Memo else null end)
      ,Memo13=max(case when Row=13 then Memo else null end)
      ,Memo14=max(case when Row=14 then Memo else null end)
      ,Memo15=max(case when Row=15 then Memo else null end)
      ,Memo16=max(case when Row=16 then Memo else null end)
      ,Memo17=max(case when Row=17 then Memo else null end)
      ,Memo18=max(case when Row=18 then Memo else null end)
  from cteBase
  Group By FirstName,LastName,DOB
 Order by LastName

The CTE returns:

FirstName   LastName    DOB         Row Memo
John        Doe         1964-07-29  1   I've looked at about 15 different answers on SO but haven't found this exac
John        Doe         1964-07-29  2   t situation yet. I'm doing a custom data export and need to export to a dat
John        Doe         1964-07-29  3   a file that will be imported into an older system that needs the data in a 
John        Doe         1964-07-29  4   specific length/format. I have a "MEMO" column that can have a large amount
John        Doe         1964-07-29  5    of text in it. I need to select that value and split it into multiple colu
John        Doe         1964-07-29  6   mns with a FIXED length of 75 chars. For instance, if I have a row with a m
John        Doe         1964-07-29  7   essage that is 185 chars, I need to split that into 3 new columns of 75 cha
John        Doe         1964-07-29  8   rs, MEMO1, MEMO2, MEMO3, with the remaining space in MEMO3 being filled wit
John        Doe         1964-07-29  9   h spaces to equal the 75 chars. The other catch, I can only use up to 18 75
John        Doe         1964-07-29  10  -char columns to dump the data into. If it's longer than 1350 (18x75) chars
John        Doe         1964-07-29  11  , the rest gets truncated.
Jane        Smith       1972-03-21  1   I'm a long-time application dev who is trying to get more involved in the D
Jane        Smith       1972-03-21  2   B side of things. If I were in the C# world, I would just create a method t
Jane        Smith       1972-03-21  3   o do a for loop up to NUMBEROFMEMOS and output the data that way. I don't t
Jane        Smith       1972-03-21  4   hink that works here though. Thanks in advance!
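
The last chunk of each memo in the output above is shorter than 75 characters, while the question asks for every column to be padded with spaces to exactly 75. One possible way to do that, sketched here with a made-up variable @Chunk standing in for a Memo value from the CTE, is to append spaces and then cut back to the required width:

```sql
DECLARE @MaxLen int = 75;
DECLARE @Chunk varchar(75) = ', the rest gets truncated.';   -- stand-in for a short last chunk

SELECT Padded       = LEFT(@Chunk + SPACE(@MaxLen), @MaxLen),
       -- DATALENGTH counts trailing spaces (LEN would not); for varchar it returns 75 here
       PaddedLength = DATALENGTH(LEFT(@Chunk + SPACE(@MaxLen), @MaxLen));
```

Wrapping each max(case ...) expression in the final SELECT with the same LEFT(... + SPACE(@MaxLen), @MaxLen) pattern should give fixed-width columns. Columns for rows that do not exist would still be NULL unless they are wrapped in ISNULL(..., '') first.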

Making sure all your resultant fields are the same length:

DECLARE @NoteText nVARCHAR(1350);
DECLARE @newFieldLength int = 3;  --Yours will be 75

--GET ORIGINAL TEXT
SET @NoteText = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; --SET FROM YOUR SOURCE TABLE I GUESS

--MAKE SURE LAST ONE PADS TO THE REQUIRED LENGTH SO ALL FIELDS ARE THE SAME LENGTH
--DATALENGTH / 2 COUNTS NVARCHAR CHARACTERS INCLUDING TRAILING SPACES, WHICH LEN() IGNORES
DECLARE @textLen int = DATALENGTH(@NoteText) / 2;
--THE OUTER % GIVES 0 (NOT A WHOLE EXTRA FIELD) WHEN THE TEXT ALREADY FITS EXACTLY
DECLARE @mod int = (@newFieldLength - (@textLen % @newFieldLength)) % @newFieldLength;
--SELECT @mod;
SET @NoteText = @NoteText + SPACE(@mod);

DECLARE @NoOfFields INT = (@textLen + @mod) / @newFieldLength;

DECLARE @Loop INT = 0;
DECLARE @dynSQL nVarchar(MAX) = 'SELECT FIRSTNAME, LASTNAME, DOB, ' + CONVERT(nvarchar(10), @NoOfFields) + ' as NUMBEROFMEMOS, '; 
DECLARE @pos INT = 1;
WHILE @Loop < @NoOfFields
BEGIN  
  SET @pos = (@Loop * @newFieldLength) + 1;
  --nvarchar(10) LEAVES ROOM FOR POSITIONS SUCH AS 1276, WHICH DO NOT FIT IN nvarchar(2)
  SET @dynSQL = @dynSQL + 'SUBSTRING(@NoteText, ' + CONVERT(nvarchar(10), @pos) + ', '  + CONVERT(nvarchar(10), @newFieldLength) +  ') as MEMOLINE_' + CONVERT(nvarchar(10), @Loop + 1) + ', ';

  SET @Loop = @Loop + 1;
END

SET @dynSQL = @dynSQL + 'FROM CUSTOMER';
SET @dynSQL = REPLACE( @dynSQL, ', FROM CUSTOMER', ' FROM CUSTOMER ');

--RUN THE RESULTANT SQL; @NoteText IS PASSED AS A PARAMETER BECAUSE DYNAMIC SQL CANNOT SEE OUTER VARIABLES
EXEC sp_executesql @dynSQL, N'@NoteText nvarchar(1350)', @NoteText = @NoteText;

If you split every row of data into a different number of columns, you will need to create an INSERT statement for every row.
Instead, you can always generate all 18 memo columns and do a bulk insert:

INSERT INTO [OtherServer].[OtherDB].[user].[table]
SELECT FIRSTNAME, 
       LASTNAME, 
       DOB, 
       -- SUBSTRING takes (start, length); appending SPACE(75) and taking LEFT(..., 75) pads short values
       LEFT(SUBSTRING(NOTETEXT, 1, 75) + SPACE(75), 75) as MEMOLINE1,
       LEFT(SUBSTRING(NOTETEXT, 76, 75) + SPACE(75), 75) as MEMOLINE2,
       LEFT(SUBSTRING(NOTETEXT, 151, 75) + SPACE(75), 75) as MEMOLINE3,
       ...
       LEFT(SUBSTRING(NOTETEXT, 1276, 75) + SPACE(75), 75) as MEMOLINE18
FROM [myServer].[myDB].[myUser].[CUSTOMER]

If there are a lot of rows to export, you can work in chunks.
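
As a rough sketch of what working in chunks could look like, the loop below copies rows in batches ordered by a key. Everything in it is an assumption made for the illustration: the integer key CustomerId, the table variables @Customer and @Export, the batch size, and the single memo column (the real export would list all 18).

```sql
-- Hypothetical source and target; table variables keep the sketch self-contained.
DECLARE @Customer TABLE (CustomerId int IDENTITY(1,1) PRIMARY KEY, FirstName varchar(50), NoteText varchar(max));
DECLARE @Export   TABLE (CustomerId int PRIMARY KEY, FirstName varchar(50), MemoLine1 char(75));

INSERT INTO @Customer (FirstName, NoteText)
VALUES ('John', 'first memo'), ('Jane', 'second memo'), ('Joe', 'third memo');

DECLARE @BatchSize int = 2;   -- tiny on purpose; a real export would use a much larger batch
DECLARE @LastId int = 0;      -- highest key copied so far
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Copy the next batch of rows whose key is above the last one already exported.
    INSERT INTO @Export (CustomerId, FirstName, MemoLine1)
    SELECT TOP (@BatchSize)
           CustomerId,
           FirstName,
           LEFT(SUBSTRING(NoteText, 1, 75) + SPACE(75), 75)
    FROM @Customer
    WHERE CustomerId > @LastId
    ORDER BY CustomerId;

    SET @Rows = @@ROWCOUNT;   -- 0 once nothing is left, which ends the loop

    SELECT @LastId = ISNULL(MAX(CustomerId), @LastId) FROM @Export;
END

SELECT CustomerId, FirstName, MemoLine1 FROM @Export ORDER BY CustomerId;
```

Paging by key rather than by OFFSET means each batch only has to seek past the last exported id, which tends to stay cheap as the export progresses.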
