
SSIS: How to read, omit by matching, and write records from pipe delimited flat file in C#

I am a beginner in SSIS and C# programming; I just got a new job and want to make a mark. I previously created a package to convert a tab-delimited flat file to a SQL Server table, but that flat file was prepared manually for conversion. Now I need to automate the conversion of a pipe-delimited file, which was created like a report and has several heading and sub-heading rows. The conversion from pipe to tab delimited is not an issue for me. However, I can't find a way, or any help online, to understand how to read each record, determine its content, and omit or write the record.

I put together the SSIS C# script shown below, coding in Main() only, but I am getting the following error and I don't know why. Can you help me out here?

ERROR - (Error at Script Task1: The binary code for the script is not found. Please open the script in the designer by clicking Edit Script button and make sure it builds successfully. Error at Script Task 1: There were errors during task validation.)

The script is supposed to:

1) Read each record in the pipe delimited flat file

2) Check each line/record to determine whether it contains any of the following values, and do not write the record if it does:

• Spaces

• Value - "Business Unit:" etc.

• Value - "Empl Id | Employee Name | Dept Id | Department | EE Home Phone | Emergency Contact Name | Primary | Telephone | Relationship"

• The last value is the heading. I want to write the 1st occurrence of this heading, but not any occurrences of it afterwards.

SCRIPT:

public void Main()
{
  string SourcePath = Dts.Variables["User::Source_Path"].Value.ToString();
  string DestinationPath = Dts.Variables["User::Destination_Path"].Value.ToString();
  string Line = string.Empty;

  try
  {
    using (StreamReader sr = new StreamReader(SourcePath))
    using (StreamWriter ds = new StreamWriter(DestinationPath))
    {
      while ((Line = sr.ReadLine()) != null)
      { 
        // MessageBox.Show(Line);
        if (Line == " ")
          Console.WriteLine("Blank Line");
        else
          if (Line == "Business Unit: 069 - DEPT OF XXXX XXXXX")
            Console.WriteLine("069 Heading");
          else
            if (Line == "Business Unit: 071 - DEPT. OF YYYYY YYYYYYY")
              Console.WriteLine("071 Heading");
            else
              if (Line == "Empl Id | Employee Name | Dept Id | Department | EE Home Phone | Emergency Contact Name | Primary | Telephone | Relationship")
                Console.WriteLine("Main Heading");

        // write to destination file
        ds.WriteLine(Dts.Variables["User::Destination_Path"].Value);
      }
      // close the stream
      ds.Close();
      //string data = sr.ReadToEnd();
      //MessageBox.Show(data);
    }
    Dts.TaskResult = (int)ScriptResults.Success;
  }
  catch (Exception ex)
  {
    MessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK);
  }

The file path is being printed because you're calling StreamWriter.WriteLine() with the variable that holds the file path, as opposed to the text itself. The file path should be used when initializing the StreamWriter, and the text when the WriteLine() method is called. You can also remove the Close() call; it isn't necessary, since the writer is closed upon exiting the using block.

I'd also recommend storing the text you don't want to write in string variables, as done below. As necessary, you can add additional strings to be filtered out (i.e. the "TextToOmit" strings below). You may need to make a couple of modifications for the exact strings you want to omit, but the code below will keep the first header and remove the subsequent occurrences. Blank lines need to be filtered out as well. Be careful with String.IndexOf here: it returns -1 when the text is not found, so a check such as line.IndexOf(' ') < 0 rejects every line that contains a space, including real records; String.IsNullOrWhiteSpace(line) only rejects blank or whitespace-only lines. I'm assuming that you want to write each record on a new line, which is why Environment.NewLine is added when building the output text.

You mentioned concerns about getting a direct match. You can use the IndexOf method with the StringComparison.CurrentCultureIgnoreCase parameter to find the first occurrence without case-sensitivity. With this approach, however, you will want to be sure that the text you're omitting doesn't occur in any records that need to be kept. An example follows the main code snippet.

// Requires: using System; using System.IO; using System.Text;
string sourcePath = Dts.Variables["User::Source_Path"].Value.ToString();
string destinationPath = Dts.Variables["User::Destination_Path"].Value.ToString();
string line;
StringBuilder outputText = new StringBuilder();
string headerText = "YourHeaderLine";
string secondTextToOmit = "TextThatNeedsToBeOmitted";
string thirdTextToOmit = "TextThatNeedsToBeOmitted";
int headerCount = 0;
try
{
    using (StreamReader sr = new StreamReader(sourcePath))
    {
        while ((line = sr.ReadLine()) != null)
        {
            // only write the first occurrence of the header
            if (line == headerText && headerCount == 0)
            {
                // AppendLine adds Environment.NewLine after the record
                outputText.AppendLine(line);
                headerCount++;
            }
            // skip blank lines, repeated headers, and the texts to omit;
            // IsNullOrWhiteSpace replaces an IndexOf(' ') < 0 check, which
            // would reject every line that contains a space
            else if (!string.IsNullOrWhiteSpace(line) && line != headerText
                && line != secondTextToOmit && line != thirdTextToOmit)
            {
                outputText.AppendLine(line);
            }
        }
    }
    // initialize the StreamWriter using the file path, once, after reading
    using (StreamWriter writer = new StreamWriter(destinationPath))
    {
        // write the filtered text
        writer.Write(outputText.ToString());
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
    // report the failure to SSIS instead of leaving the try without a handler
    Dts.Events.FireError(0, "Script Task", ex.Message, string.Empty, 0);
    Dts.TaskResult = (int)ScriptResults.Failure;
}
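The question lists "Business Unit:" lines with different unit names, so exact string comparison needs one variable per unit. As a rough sketch only, the same filtering rules can be written as a standalone function that omits by prefix instead. The class name ReportFilter, the method FilterLines, and the choice of prefix matching are assumptions for illustration and are not part of the original script.

```csharp
using System;
using System.Collections.Generic;

public static class ReportFilter
{
    // Hypothetical helper: returns the lines to keep, in their original order.
    // headerText is written once; lines starting with any prefix are dropped.
    public static List<string> FilterLines(
        IEnumerable<string> lines, string headerText, IEnumerable<string> prefixesToOmit)
    {
        var kept = new List<string>();
        var prefixes = new List<string>(prefixesToOmit);
        bool headerWritten = false;

        foreach (string line in lines)
        {
            // drop blank or whitespace-only lines
            if (string.IsNullOrWhiteSpace(line))
                continue;

            string trimmed = line.Trim();

            // keep only the first occurrence of the header
            if (trimmed == headerText)
            {
                if (!headerWritten)
                {
                    kept.Add(line);
                    headerWritten = true;
                }
                continue;
            }

            // drop sub-headings such as "Business Unit: ..." by prefix
            bool omit = false;
            foreach (string prefix in prefixes)
            {
                if (trimmed.StartsWith(prefix, StringComparison.Ordinal))
                {
                    omit = true;
                    break;
                }
            }

            if (!omit)
                kept.Add(line);
        }
        return kept;
    }
}
```

Inside the Script Task this could be called as File.WriteAllLines(destinationPath, ReportFilter.FilterLines(File.ReadLines(sourcePath), headerText, new[] { "Business Unit:" })), assuming the header line in the file matches headerText exactly after trimming.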

Match Without Case-Sensitivity Example:

line.IndexOf(thirdTextToOmit, 0, StringComparison.CurrentCultureIgnoreCase) < 0
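To make the caveat about substring matching concrete, here is a minimal sketch that wraps the expression above in a helper. The names OmitMatcher and ContainsIgnoreCase are hypothetical, and the sample strings are invented for illustration.

```csharp
using System;

public static class OmitMatcher
{
    // Hypothetical helper: true when line contains textToOmit anywhere,
    // ignoring case. IndexOf returns -1 when the text is not found.
    public static bool ContainsIgnoreCase(string line, string textToOmit)
    {
        return line.IndexOf(textToOmit, StringComparison.CurrentCultureIgnoreCase) >= 0;
    }
}

public static class OmitMatcherDemo
{
    public static void Run()
    {
        // A sub-heading is matched regardless of case.
        Console.WriteLine(OmitMatcher.ContainsIgnoreCase("DEPT OF XXXX", "dept of"));

        // A record that happens to contain the same text is matched too,
        // so it would be omitted along with the sub-headings.
        Console.WriteLine(OmitMatcher.ContainsIgnoreCase("001 | Ann Lee | dept of records", "DEPT OF"));
    }
}
```

Because this is a substring test, the text to omit should be specific enough that it cannot appear inside a real record. If the comparison should not depend on the machine's culture settings, StringComparison.OrdinalIgnoreCase is an alternative.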
