简体   繁体   中英

Conditional Split in SSIS - SQL

I'm quite new to SQL Databases, but I'm trying to add a Conditional Split in my Data Flow between my Flat File Source and OLE DB Database to exclude records containing some special characters such as ø and ¿ and ¡ on the [title] column. Those are causing errors when creating a table and therefore I want those records to be split from my table. How can I create a conditional split for this?

As a bonus: Is there a way to only filter in a conditional split the rows that contain numbers from 0-9 and letters from a-zA-Z so that all rows with "special" symbols are filtered out automatically?

A conditional split works by determining whether a condition is true or false. So, if you can write a rule that evaluates to true or false, and you can multiple rules to address assorted business needs, then you can properly shunt rows into different pathways.

How do I do that?

I always advocate that people add new columns to their data flows to handle this stuff. It's the only way you're going to have to debug when a condition comes up that you think should have been handled but wasn't.

Whether you create a column called IsTitleOnlyAlphaNumeric or IsTitleInternational is really up to you. General programming rule is you go for the common/probable case. Since the ASCII universe is 127 characters max, 255 for extended ASCII, I'd advocate the former. Otherwise, you're going to play whack-a-mole as the next file has umlats or a thorn in it.

Typically, we would add a new column through a Derived Column Transformation which means you're working with the SSIS expression language. However, in this case the expression does not have the ability to gracefully * identify whether the string is good or not. Instead, you'll want to use the .NET library for this heavy lifting. That's the Script Component and you'll have it operate in the Transformation mode (default).

Add a new column of type boolean IsTitleOnlyAlphaNumeric and crib the regular expression from check alphanumeric characters in string in c#

The relevant bit of the OnRowProcessed (name approximate) would look like

Row.IsTitleOnlyAlphaNumeric = isAlphaNumeric(Row.Title);

As rows flow through, that will be evaluated for each one and you'll see whether it meets the criteria or not. Depending on your data, you might need a check for NULL before you call that method.

How I shouldn't do that

* You could abuse the daylights out of the REPLACE function and test the allowable length of an expression by doing something like creating a new column called StrippedTitle and we are going to replace everything allowable character with an empty string. If the length of the trimmed final string is not zero, then there's something bad in there.

REPLACE(REPLACE(REPLACE([Title], "A", ""), "B", ""), "C", "") ..., "a", ""), "b", "") ..., "9", "")

where ... implies you've continued the pattern. Yes, you'll have to replace upper and lower cased characters. ASCIITable.com or similar will be your friend.

That will be a new column. So add a second Derived Column component to calculate whether it's empty - again, easier to debug. IsTitleOnlyAlphaNumeric

LEN(RTRIM(StrippedTitle)) == 0

Terrible approach but the number of questions I answer where people later clarify "I cannot use script" is decidedly non-zero.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM