
Parsing data from one row into multiple rows in CSV file

I am new to SSIS and need some help figuring out how to parse this data. Each entry in Course-level Learning Objectives needs to be split into its own row, and the text in the square brackets [] needs to be moved to its own column. Any help would be greatly appreciated. The CSV file contains multiple records; the example below is just one record.

Current format of the CSV file:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment. ","   

Identify learn. [EXPLORE] 
Evaluate personal,  goals. [ACT] 
Utilize development. [EXPLORE] 
"

Format the file needs to be in:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives,Type

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Identify learn.", [EXPLORE] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Evaluate goals.", [ACT] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Utilize dev.", [EXPLORE] 

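There is no direct way to consume this file in SSIS 2008, but in SSIS 2012 you can load it and then pivot the data. If you want to handle it in SSIS 2008, use a Script Task to reformat the file first, then read the result with a Flat File Source in a Data Flow. Inside the Script Task, use a file reader to do the reformatting. For more information on splitting this file, see http://sqlbisam.blogspot.com/2013/12/parsing-data-from-one-row-into-multiple.html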

Basic steps:

  1. Get the data into a table in SQL Server using SSIS
  2. Slay the "Course-level Learning Objectives" dragon
  3. Unpivot the results

Here is a code snippet for the "Nth" index function:

    CREATE FUNCTION [dbo].[udf_NthIndex] 
                   (@Input     VARCHAR(8000), 
                    @Delimiter CHAR(1), 
                    @Ordinal   INT) 

    RETURNS INT 
    AS 

      BEGIN 

        DECLARE  @Pointer INT, 
                 @Last    INT, 
                 @Count   INT 

        SET @Pointer = 1 
        SET @Last = 0 
        SET @Count = 1 

        WHILE (2 > 1) 
          BEGIN 
            SET @Pointer = CHARINDEX(@Delimiter,@Input,@Pointer) 
            IF @Pointer = 0 
              BREAK 
            IF @Count = @Ordinal 

              BEGIN 
                SET @Last = @Pointer 
                BREAK 
              END 
            SET @Count = @Count + 1 
            SET @Pointer = @Pointer + 1 

          END 

        RETURN @Last 

      END

    GO
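As a quick check of how the function behaves, it can be called against a short string. This is a minimal sketch that assumes dbo.udf_NthIndex has been created as shown above; the sample string is made up for illustration.

```sql
-- Positions of the 1st to 4th comma in 'a,b,c,d'.
-- The function returns 0 when the requested occurrence does not exist.
SELECT dbo.udf_NthIndex('a,b,c,d', ',', 1) AS FirstComma   -- 2
     , dbo.udf_NthIndex('a,b,c,d', ',', 2) AS SecondComma  -- 4
     , dbo.udf_NthIndex('a,b,c,d', ',', 3) AS ThirdComma   -- 6
     , dbo.udf_NthIndex('a,b,c,d', ',', 4) AS FourthComma; -- 0
```

The 0 result matters for the query that follows: if a record has fewer line breaks than the query asks for, the computed lengths go negative and SUBSTRING raises an error.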

This method solves it using Common Table Expressions, the "Nth" index function, and UNPIVOT:

WITH s1
    AS ( SELECT 'ABE 095' AS [Prefix/Code]
              , 'Keys to Academic Success' AS Name
              , '3.0' AS Credits
              , 'Basic .. assessment. ' AS Description
              , '   

Identify learn. [EXPLORE] 
Evaluate personal,  goals. [ACT] 
Utilize development. [EXPLORE] 
' AS [Course-level Learning Objectives]
       ) , s2
    AS ( SELECT [Prefix/Code]
              , Name
              , Credits
              , Description
              , [Course-level Learning Objectives]
              -- Objectives are assumed to be separated by CR/LF pairs.
              -- Start  = position of the preceding CHAR(13) + 2 (skips CR and LF).
              -- Length = distance between the two CHAR(13) positions - 2,
              --          so the trailing CR/LF is not carried into the chunk.
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 2 ) + 2 AS Type1Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 2 ) - 2 AS Type1Length
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 ) + 2 AS Type2Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 3 ) - 2 AS Type2Length
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 ) + 2 AS Type3Start
              , dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 5 )
                - dbo.udf_NthIndex( [Course-level Learning Objectives] , CHAR( 13 ) , 4 ) - 2 AS Type3Length
           FROM s1
       ) , s3
    AS ( -- Read from s2 only. Joining s1 and s2 without a condition would
         -- multiply rows as soon as the source holds more than one record.
         SELECT [Prefix/Code]
              , Name
              , Credits
              , Description
              , RTRIM( LTRIM( SUBSTRING( [Course-level Learning Objectives] , Type1Start , Type1Length ) ) ) AS Type1_chunk
              , RTRIM( LTRIM( SUBSTRING( [Course-level Learning Objectives] , Type2Start , Type2Length ) ) ) AS Type2_chunk
              , RTRIM( LTRIM( SUBSTRING( [Course-level Learning Objectives] , Type3Start , Type3Length ) ) ) AS Type3_chunk
           FROM s2
       ) , unpivot1
    AS ( SELECT [Prefix/Code]
              , Name
              , Credits
              , Description
              , Type_chunk
           FROM( 
                 SELECT [Prefix/Code]
                      , Name
                      , Credits
                      , Description
                      , Type1_chunk
                      , Type2_chunk
                      , Type3_chunk
                   FROM s3
               )p UNPIVOT( Type_chunk FOR Type_descrip IN( Type1_chunk
                                                         , Type2_chunk
                                                         , Type3_chunk
                                                         )
                                                         )AS unpvt
       )
    SELECT [Prefix/Code]
         , Name
         , Credits
         , Description
         --, Type_chunk
         , LEFT( u.Type_chunk , -2 + dbo.udf_NthIndex( u.Type_chunk , '[' , 1
                                                     )
               )AS [Learning Objectives]
         , RIGHT( u.Type_chunk , 1 + LEN( u.Type_chunk
                                        ) - dbo.udf_NthIndex( u.Type_chunk , '[' , 1
                                                            )
                )AS Type
      FROM unpivot1 u;
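To see what the final SELECT does to a single chunk, the split on '[' can be tried in isolation. This is only a sketch: @chunk is a hypothetical variable holding one already-trimmed objective, and the built-in CHARINDEX stands in for dbo.udf_NthIndex( ... , '[' , 1 ), which returns the same position for a first occurrence.

```sql
-- One trimmed objective line, as it would appear in Type_chunk.
DECLARE @chunk VARCHAR(100) = 'Identify learn. [EXPLORE]';

SELECT LEFT(@chunk, CHARINDEX('[', @chunk) - 2)               AS [Learning Objectives] -- 'Identify learn.'
     , RIGHT(@chunk, 1 + LEN(@chunk) - CHARINDEX('[', @chunk)) AS Type;                 -- '[EXPLORE]'
```

The "- 2" drops both the bracket and the single space in front of it, so a chunk with no space before the bracket would lose its last character.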

If you are able to use Regular Expressions, you can save a bit of code. Using RegEx in SQL Server 2008 requires CLR. This book does a great job of showing you how to do that step by step. The solution above hard-codes three chunks, so it works well enough only for a small, fixed number of "Type" values per course.
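When the number of objectives varies from course to course, one alternative to the fixed three-chunk query is a recursive CTE that peels off one line at a time. This is a different technique from the UNPIVOT approach above, sketched under some assumptions: the table variable @Courses and its Objectives column are hypothetical stand-ins for the table loaded by SSIS, lines end in CR/LF, and no record has more lines than the default recursion limit of 100.

```sql
-- Hypothetical staging data standing in for the table loaded by SSIS.
DECLARE @Courses TABLE (
    [Prefix/Code] VARCHAR(20),
    Objectives    VARCHAR(8000)
);

INSERT INTO @Courses VALUES
    ('ABE 095',
     '   ' + CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10)
     + 'Identify learn. [EXPLORE] ' + CHAR(13) + CHAR(10)
     + 'Evaluate personal,  goals. [ACT] ' + CHAR(13) + CHAR(10)
     + 'Utilize development. [EXPLORE] ' + CHAR(13) + CHAR(10));

WITH lines AS (
    -- Anchor: take the text before the first carriage return as the first line.
    -- Appending CHAR(13) guarantees CHARINDEX finds one, even on the last line.
    SELECT [Prefix/Code]
         , 1 AS LineNo
         , CAST(LEFT(Objectives, CHARINDEX(CHAR(13), Objectives + CHAR(13)) - 1) AS VARCHAR(8000)) AS Line
         , CAST(SUBSTRING(Objectives, CHARINDEX(CHAR(13), Objectives + CHAR(13)) + 2, 8000) AS VARCHAR(8000)) AS Rest
      FROM @Courses
    UNION ALL
    -- Recursive step: repeat on whatever is left after skipping the CR/LF pair.
    SELECT [Prefix/Code]
         , LineNo + 1
         , CAST(LEFT(Rest, CHARINDEX(CHAR(13), Rest + CHAR(13)) - 1) AS VARCHAR(8000))
         , CAST(SUBSTRING(Rest, CHARINDEX(CHAR(13), Rest + CHAR(13)) + 2, 8000) AS VARCHAR(8000))
      FROM lines
     WHERE Rest <> ''
)
SELECT [Prefix/Code]
       -- Appending '[' keeps the computed length non-negative for every row,
       -- whatever order the engine evaluates the filter and the expressions in.
     , LTRIM(RTRIM(LEFT(Line, CHARINDEX('[', Line + '[') - 1)))  AS [Learning Objectives]
     , RTRIM(SUBSTRING(Line, CHARINDEX('[', Line + '['), 8000))  AS Type
  FROM lines
 WHERE CHARINDEX('[', Line) > 0          -- skip blank lines that carry no [TYPE] tag
 ORDER BY [Prefix/Code], LineNo;
```

For the sample record this should return three rows, one per objective, with the bracketed tag in the Type column. The remaining course columns (Name, Credits, Description) can be carried through the CTE the same way as [Prefix/Code].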

