Processing a flat file with multiple formats and different sets of records in SSIS

I am trying to process a pipe-delimited flat file using SSIS. The file contains two different types of records. Each record set has its own header row and its own trailer row, and the two record sets need to go to two different tables.

The trailer row holds the row count for its record set, while the header row contains the column names for the records. So in essence it is like two types of files combined in one file.

I've tried several solutions, including a Conditional Split, but I haven't been able to achieve this. I know this can be done with a Script Component and C#, but I haven't been able to get that working either. I've attached an image to show the file format.

This is what I have tried so far.

  1. I changed the flat file connection to ragged right so that each line comes through as a single column.
  2. I then created a Script Component as a source. The idea was to read the file line by line using a StreamReader and create 4 output buffers: 2 for the column headers and 2 for the different detail rows, with the script set to stop when it reaches the trailer row. My intention was to merge each header row with its respective detail rows and then save them to the relevant tables.
  3. I then used C# code that I found on the Microsoft site during my research.

I used the code below:

public class ScriptMain : UserComponent
{
    private StreamReader textReader;
    private string RTWFile;

    public override void AcquireConnections(object Transaction)
    {
        IDTSConnectionManager100 connMgr = this.Connections.RTWCon;
        RTWFile = (string)connMgr.AcquireConnection(null);
    }

    public override void PreExecute()
    {
        base.PreExecute();
        textReader = new StreamReader(RTWFile);
    }

    public override void CreateNewOutputRows()
    {
        string nextLine;
        string[] columns;

        char[] delimiters;
        delimiters = "|".ToCharArray();

        nextLine = textReader.ReadLine();
        while (nextLine != null)
        {
            columns = nextLine.Split(delimiters);
            {
                HeadersBuffer.AddRow();
                HeadersBuffer.EmployeeNumber = columns[0];
                HeadersBuffer.LegacyStaffID = columns[1];
                HeadersBuffer.FirstName = columns[2];
                HeadersBuffer.LastName = columns[3];
                HeadersBuffer.PassportIssuingCountry = columns[4];
                HeadersBuffer.PassportType = columns[5];
                HeadersBuffer.PassportNumber = columns[6];
                HeadersBuffer.PassportIssuingAuthority = columns[7];
                HeadersBuffer.PassportIssueDate = columns[8];
                HeadersBuffer.PassportExpirationDate = columns[9];
            }
            nextLine = textReader.ReadLine();
        }
    }

    public override void PostExecute()
    {
        base.PostExecute();
        textReader.Close();
    }
}
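
The routing described in step 2 can be sketched outside SSIS as a small state machine that labels each line as a header, detail, or trailer row and tracks which record set it belongs to. This is only a sketch under assumptions: `LineKind` and `LineClassifier` are hypothetical names, not part of the SSIS API, and the code assumes that a trailer row can be recognized by a marker string and that the first line after a trailer is the next header. In a Script Component, each case would add a row to the matching output buffer instead of to a list.

```csharp
using System;
using System.Collections.Generic;

public enum LineKind { Header, Detail, Trailer }

public static class LineClassifier
{
    // Returns (record set index, kind, raw line) for every non-empty line.
    // Assumption: trailerMarker appears only in trailer rows.
    public static List<Tuple<int, LineKind, string>> Classify(
        IEnumerable<string> lines, string trailerMarker)
    {
        var result = new List<Tuple<int, LineKind, string>>();
        int setIndex = 0;          // 0 = first record set, 1 = second, ...
        bool expectHeader = true;  // the first line of each set is its header

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            if (line.Contains(trailerMarker))
            {
                result.Add(Tuple.Create(setIndex, LineKind.Trailer, line));
                setIndex++;            // the next line starts a new record set
                expectHeader = true;
            }
            else if (expectHeader)
            {
                result.Add(Tuple.Create(setIndex, LineKind.Header, line));
                expectHeader = false;
            }
            else
            {
                result.Add(Tuple.Create(setIndex, LineKind.Detail, line));
            }
        }
        return result;
    }
}
```

With a classifier like this, only rows labelled `Detail` would be split on `|` and mapped to columns, and the record set index would decide which of the two detail outputs receives the row.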

[Image: flat file format]

[Image: output buffers]

[Image: flat file configuration]

Do you also need to extract the information from the trailer rows?

Separating the file into two files is the cleanest way to do this. We can do that with a Script Task once we understand your requirements.

UPDATE:

Add a Script Task and pass the FilePath variable to it as a read-only variable.

Edit the Script Task and add these to the namespaces region at the top:

using System.IO;
using System.Text;
using System.Collections.Generic;

public void Main()
{
    try
    {
        string InputFilePath = Dts.Variables["User::FilePath"].Value.ToString();
        string InputFolder = Path.GetDirectoryName(InputFilePath);
        string TrailerLine = "TotalRow";   // text that identifies a trailer row
        bool FirstFile = true;
        string line;
        List<string> FirstFileLines = new List<string>();
        List<string> SecondFileLines = new List<string>();

        // Read the file line by line; the using block closes the reader
        // even if an exception is thrown.
        using (StreamReader file = new StreamReader(InputFilePath))
        {
            while ((line = file.ReadLine()) != null)
            {
                // A trailer row ends the current record set and is not written out.
                if (line.Contains(TrailerLine))
                {
                    FirstFile = false;
                    continue;
                }

                if (FirstFile) FirstFileLines.Add(line);
                else SecondFileLines.Add(line);
            }
        }

        // Write each record set (header row plus detail rows) to its own file.
        File.WriteAllLines(Path.Combine(InputFolder, "FirstFile.txt"), FirstFileLines.ToArray());
        File.WriteAllLines(Path.Combine(InputFolder, "SecondFile.txt"), SecondFileLines.ToArray());
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (System.Exception ex)
    {
        // Raise an SSIS error event instead of a message box, which would
        // block a package that runs unattended.
        Dts.Events.FireError(0, "Script Task", ex.Message, string.Empty, 0);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}
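
If the trailer rows are needed, for example to check the row count of each record set before loading, the same loop can keep the count instead of discarding the trailer. The following is a minimal sketch, not a tested SSIS component: `RecordSetSplitter` and `RecordSet` are hypothetical names, and the trailer layout (a line containing the marker whose last pipe-separated field is the detail-row count, such as `TotalRow|2`) is an assumption, since the actual file layout is only shown in the image.

```csharp
using System;
using System.Collections.Generic;

public static class RecordSetSplitter
{
    // One record set: its header line, its detail lines, and the count claimed by its trailer.
    public sealed class RecordSet
    {
        public string Header;
        public List<string> Details = new List<string>();
        public int? TrailerCount;   // null if the trailer was missing or not parseable

        public bool CountMatches
        {
            get { return TrailerCount.HasValue && TrailerCount.Value == Details.Count; }
        }
    }

    // Assumption: a trailer line contains trailerMarker and its last
    // pipe-separated field is the number of detail rows in that set.
    public static List<RecordSet> Split(IEnumerable<string> lines, string trailerMarker)
    {
        var sets = new List<RecordSet>();
        RecordSet current = null;

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            if (line.Contains(trailerMarker))
            {
                if (current != null)
                {
                    string[] fields = line.Split('|');
                    int count;
                    if (int.TryParse(fields[fields.Length - 1].Trim(), out count))
                        current.TrailerCount = count;
                    sets.Add(current);
                    current = null;   // the next line starts a new record set
                }
                continue;
            }

            if (current == null)
                current = new RecordSet { Header = line };   // first line of a set is its header
            else
                current.Details.Add(line);
        }

        if (current != null) sets.Add(current);   // last set had no trailer
        return sets;
    }
}
```

Inside the Script Task, the result of `RecordSetSplitter.Split(File.ReadAllLines(InputFilePath), "TotalRow")` could be checked with `CountMatches` and the task set to fail when a count does not match, before the two files are written.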
