Parsing data from one row into multiple rows in CSV file

I am new to SSIS and need some help figuring out how to parse this data. The Course-level Learning Objectives need to be split into multiple rows, and the data in the [] needs to be moved to its own column. Any help would be greatly appreciated. The CSV file contains multiple records; the example below is just one record.

Current format of the CSV file:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment. ","   

Identify learn. [EXPLORE] 
Evaluate personal,  goals. [ACT] 
Utilize development. [EXPLORE] 
"

Format the file needs to be in:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives,Type

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Identify learn.", [EXPLORE] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Evaluate goals.", [ACT] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Utilize dev.", [EXPLORE] 

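There is no direct way to consume this file in SSIS 2008, but it can be done in SSIS 2012, after which the data can be pivoted. If you want to use it in SSIS 2008, you should use a Script Task to reformat the file and then read the result with a Flat File Source in the data flow. In the Script Task, you should use a file reader to do the reformatting. For more information on splitting this file, see this link: http://sqlbisam.blogspot.com/2013/12/parsing-data-from-one-row-到-multiple.html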

Basic steps:

  1. Get the data into a table in SQL Server using SSIS
  2. Slay the "Course-level Learning Objectives" dragon
  3. Unpivot the results
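
For step 1, a staging table along these lines could receive the rows. This is only a sketch: the table name `dbo.CourseStaging` and the column sizes are assumptions, not something given in the question.

```sql
-- Hypothetical staging table for step 1; the name and the column sizes are assumptions.
CREATE TABLE dbo.CourseStaging
(
    [Prefix/Code]                      VARCHAR(20)   NOT NULL,
    Name                               VARCHAR(200)  NOT NULL,
    Credits                            VARCHAR(10)   NULL,
    Description                        VARCHAR(8000) NULL,
    -- Raw multi-line text, still containing the line breaks and the [TYPE] tags.
    [Course-level Learning Objectives] VARCHAR(8000) NULL
);
```

The objectives column is kept as VARCHAR(8000) so that it can be passed straight to a function that takes a VARCHAR(8000) input.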

Here is a code snippet for the "Nth" index function:

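    -- Returns the 1-based position of the @Ordinal-th occurrence of @Delimiter
    -- in @Input, or 0 if @Input has fewer than @Ordinal occurrences.
    CREATE FUNCTION [dbo].[udf_NthIndex] 
                   (@Input     VARCHAR(8000), 
                    @Delimiter CHAR(1), 
                    @Ordinal   INT) 
    RETURNS INT 
    AS 
      BEGIN 
        DECLARE  @Pointer INT,   -- position where the next search starts
                 @Last    INT,   -- result; stays 0 when there is no match
                 @Count   INT    -- ordinal of the occurrence being examined

        SET @Pointer = 1 
        SET @Last = 0 
        SET @Count = 1 

        WHILE (1 = 1)   -- runs until one of the BREAKs below fires
          BEGIN 
            SET @Pointer = CHARINDEX(@Delimiter,@Input,@Pointer) 
            IF @Pointer = 0   -- no further occurrence
              BREAK 
            IF @Count = @Ordinal 
              BEGIN 
                SET @Last = @Pointer 
                BREAK 
              END 
            SET @Count = @Count + 1 
            SET @Pointer = @Pointer + 1   -- resume just past this match
          END 

        RETURN @Last 
      END

    GO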
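
A quick way to sanity-check the function is to call it on a short string. The sample value below is made up for illustration; positions are 1-based, as with CHARINDEX.

```sql
-- Hypothetical sample input, not taken from the CSV file.
DECLARE @s VARCHAR(8000);
SET @s = 'a,b,c';

SELECT dbo.udf_NthIndex(@s, ',', 1) AS FirstComma   -- 2
     , dbo.udf_NthIndex(@s, ',', 2) AS SecondComma  -- 4
     , dbo.udf_NthIndex(@s, ',', 3) AS ThirdComma;  -- 0: there is no third comma
```

A return value of 0 means the requested occurrence does not exist, so callers that feed the result into SUBSTRING or LEFT should expect that case for short inputs.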

This method solves it using Common Table Expressions, the "Nth" index function, and UNPIVOT:

WITH s1
    AS ( SELECT 'ABE 095' AS [Prefix/Code]
              , 'Keys to Academic Success' AS Name
              , '3.0' AS Credits
              , 'Basic .. assessment. ' AS Description
              , '   

Identify learn. [EXPLORE] 
Evaluate personal,  goals. [ACT] 
Utilize development. [EXPLORE] 
' AS [Course-level Learning Objectives]
       ) , s2
    AS ( -- Locate each objective by the positions of the carriage returns (CHAR(13))
         -- around it. The first two carriage returns precede the first objective.
         -- "+ 2" skips the CR/LF pair before a chunk; "- 2" leaves out the pair after it.
         SELECT [Prefix/Code]
              , Name
              , Credits
              , Description
              , [Course-level Learning Objectives]
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 2 ) + 2 AS Type1Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 2 ) - 2 AS Type1Length
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 ) + 2 AS Type2Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 ) - 2 AS Type2Length
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 ) + 2 AS Type3Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 5 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 ) - 2 AS Type3Length
           FROM s1
       ) , s3
    AS ( -- Cut out the three chunks. Everything is read from s2 (no cross join
         -- with s1), so each course keeps its own text when s1 holds many records.
         SELECT s2.[Prefix/Code]
              , s2.Name
              , s2.Credits
              , s2.Description
              , RTRIM( LTRIM( SUBSTRING( s2.[Course-level Learning Objectives] , s2.Type1Start , s2.Type1Length ) ) ) AS Type1_chunk
              , RTRIM( LTRIM( SUBSTRING( s2.[Course-level Learning Objectives] , s2.Type2Start , s2.Type2Length ) ) ) AS Type2_chunk
              , RTRIM( LTRIM( SUBSTRING( s2.[Course-level Learning Objectives] , s2.Type3Start , s2.Type3Length ) ) ) AS Type3_chunk
           FROM s2
       ) , unpivot1
    AS ( SELECT [Prefix/Code]
              , Name
              , Credits
              , Description
              , Type_chunk
           FROM( 
                 SELECT [Prefix/Code]
                      , Name
                      , Credits
                      , Description
                      , Type1_chunk
                      , Type2_chunk
                      , Type3_chunk
                   FROM s3
               )p UNPIVOT( Type_chunk FOR Type_descrip IN( Type1_chunk
                                                         , Type2_chunk
                                                         , Type3_chunk
                                                         )
                                                         )AS unpvt
       )
    SELECT [Prefix/Code]
         , Name
         , Credits
         , Description
         --, Type_chunk
         , LEFT( u.Type_chunk , -2 + dbo.udf_NthIndex( u.Type_chunk , '[' , 1
                                                     )
               )AS [Learning Objectives]
         , RIGHT( u.Type_chunk , 1 + LEN( u.Type_chunk
                                        ) - dbo.udf_NthIndex( u.Type_chunk , '[' , 1
                                                            )
                )AS Type
      FROM unpivot1 u;

If you are able to use Regular Expressions, you can save a bit of code. Using RegEx in SQL Server 2008 requires CLR. This book does a great job of showing you how to do that step by step. The solution works well enough for a smallish number of "Type" values per course.
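
The query above hard-codes three objectives per course. If the number varies, one option is a recursive CTE that peels off one line per recursion step. This technique is not part of the answer above; the sketch below assumes CR/LF line breaks, and the table variable `@Courses` with its sample row is a made-up stand-in for the real staging table.

```sql
-- Sketch only: @Courses and its contents are hypothetical.
DECLARE @Courses TABLE ( Code VARCHAR(20), Objectives VARCHAR(8000) );

INSERT INTO @Courses ( Code, Objectives )
VALUES ( 'ABE 095'
       , 'Identify learn. [EXPLORE]' + CHAR(13) + CHAR(10)
       + 'Evaluate personal,  goals. [ACT]' + CHAR(13) + CHAR(10)
       + 'Utilize development. [EXPLORE]' );

WITH lines AS
(
    -- Anchor: Line is the text before the first line feed, Rest is what follows it.
    -- Appending CHAR(10) guarantees CHARINDEX finds a match on the last line.
    SELECT Code
         , CAST( LEFT( Objectives, CHARINDEX( CHAR(10), Objectives + CHAR(10) ) - 1 ) AS VARCHAR(8000) ) AS Line
         , CAST( SUBSTRING( Objectives, CHARINDEX( CHAR(10), Objectives + CHAR(10) ) + 1, 8000 ) AS VARCHAR(8000) ) AS Rest
      FROM @Courses
    UNION ALL
    -- Recursive step: apply the same split to the remainder until nothing is left.
    SELECT Code
         , CAST( LEFT( Rest, CHARINDEX( CHAR(10), Rest + CHAR(10) ) - 1 ) AS VARCHAR(8000) )
         , CAST( SUBSTRING( Rest, CHARINDEX( CHAR(10), Rest + CHAR(10) ) + 1, 8000 ) AS VARCHAR(8000) )
      FROM lines
     WHERE Rest > ''
)
SELECT Code
       -- Text before the first '['; appending '[' keeps LEFT valid for lines without a tag.
     , RTRIM( LEFT( Clean, CHARINDEX( '[', Clean + '[' ) - 1 ) ) AS [Learning Objectives]
       -- The '[' and everything after it; empty when the line has no tag.
     , SUBSTRING( Clean, CHARINDEX( '[', Clean + '[' ), 8000 ) AS Type
  FROM ( -- Drop the carriage return left at the end of each line, then trim spaces.
         SELECT Code
              , LTRIM( RTRIM( REPLACE( Line, CHAR(13), '' ) ) ) AS Clean
           FROM lines
       ) c
 WHERE Clean > '';   -- skip blank lines
```

With the sample row this should return one row per objective, with the bracketed tag in the Type column. The default recursion limit of 100 levels caps the number of lines per course unless a MAXRECURSION query hint is added.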
