
SSIS - Fixed number of columns in flat file source

I have a number of text files in a directory, each with a fixed number of columns (6) separated by tabs. I read these into an SSIS package using a 'Flat File Source' component. If a file has more columns than the required number, or if data is missing from any of the columns, I want to reject the file.

I have done some testing with various sample files. Whenever I add extra columns, the package accepts the file. It throws an error when there are fewer columns, which is good.

But is there a way of specifying that the file must have an exact number of columns and that data must be present in each column?

I don't have much experience with SSIS, so I would appreciate any suggestions.

Thanks

I would use a Script Task to do this.

You can use System.IO.StreamReader to open the file and read the header row, and then perform whatever validation you need on the resulting string.

I would also create a Boolean variable in the SSIS package, called something like 'FileIsValid', to which the Script Task writes True if the conditions are met and False if they aren't. I would then use this variable to direct the package flow using precedence constraints.

Something like this:

public void Main()
{
    // Assume the file is invalid until the header check passes.
    Dts.Variables["User::FileIsValid"].Value = false;

    string path = Dts.Variables["User::Filepath"].Value.ToString();

    // 'using' closes and disposes the reader even if an exception is thrown.
    using (var reader = new System.IO.StreamReader(path))
    {
        // ReadLine returns null when the file is empty.
        string header = reader.ReadLine();

        if (header != null &&
            header.Trim() == "Column1\tColumn2\tColumn3\tColumn4\tColumn5\tColumn6")
        {
            Dts.Variables["User::FileIsValid"].Value = true;
        }
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

With regard to checking that there is data in all columns, does this need to hold for every row?

You could continue reading the lines with StreamReader and use regular expressions to check each one.
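As a rough sketch of that per-row check, the helper below reads every line and tests it against a regular expression that requires exactly six non-empty, tab-separated fields. The class name FlatFileValidator and the method AllRowsValid are hypothetical names chosen for illustration. The sketch also assumes that a field containing only spaces still counts as data and that an empty file should be rejected.

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of SSIS; the names are illustrative only.
public static class FlatFileValidator
{
    // Exactly six tab-separated fields, each with at least one character.
    // Assumption: a field holding only spaces still counts as "data present".
    private static readonly Regex SixFields =
        new Regex(@"^[^\t]+(\t[^\t]+){5}$", RegexOptions.Compiled);

    // Returns true only if the file has at least one line
    // and every line matches the six-field pattern.
    public static bool AllRowsValid(string path)
    {
        bool sawLine = false;

        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                sawLine = true;
                if (!SixFields.IsMatch(line))
                    return false; // wrong column count or an empty field
            }
        }

        return sawLine;
    }
}
```

Inside the Script Task, the result of FlatFileValidator.AllRowsValid could be assigned to User::FileIsValid in place of the header comparison. If whitespace-only fields should also be rejected, the character class would need tightening.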

Expanding on Chris Mack's answer:

If the files do not have headers, you can count the columns in the first line instead.

char[] delim = new char[] { '\t' };
// Length is a property, not a method; six tab-separated columns split into six parts.
if (header.Split(delim).Length == 6)
...
