
Regex split by last index of

I have a SQL CLR function, built on the .NET regex functions, that splits values by a regular expression. In one case I use it to split a value by | . The issue is that one of the values contains a double || . Since I am sure that the second value (the right one) is a number, I know that the extra | is part of the first value (the left one).

I have:

慂||2215

and that should be split to:

慂|
2215

I am splitting using the expression [|] . I think that to make it work I need a zero-width negative lookahead assertion, but when I split by (?![|])[|] I get:

慂||2215

and if I try a negative lookbehind, (?<![|])[|] , I get:

慂
|2215

but I need the pipe to be part of the first value.

Could anyone assist me with this? I am looking for a regex-only solution, as I am not able to change the application right now.


Here is the function if anyone needs it:

/// <summary>
///     Splits an input string into an array of substrings at the positions defined by a regular expression pattern.
///     Index of each value is returned.
/// </summary>
/// <param name="sqlInput">The source material</param>
/// <param name="sqlPattern">How to parse the source material</param>
/// <returns>One (index, substring) pair per split result, with a zero-based index.</returns>
[SqlFunction(FillRowMethodName = "FillRowForSplitWithOrder")]
public static IEnumerable SplitWithOrder(SqlString sqlInput, SqlString sqlPattern)
{
    string[] substrings;
    List<Tuple<SqlInt64, SqlString>> values = new List<Tuple<SqlInt64, SqlString>>();

    if (sqlInput.IsNull || sqlPattern.IsNull)
    {
        substrings = new string[0];
    }
    else
    {
        substrings = Regex.Split(sqlInput.Value, sqlPattern.Value);
    }

    for (int index = 0; index < substrings.Length; index++)
    {
        values.Add(new Tuple<SqlInt64, SqlString>(new SqlInt64(index), new SqlString(substrings[index])));
    }

    return values;
}

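You should use a negative lookahead here rather than a lookbehind, and place it after the pipe instead of before it. In (?![|])[|] the lookahead inspects the same character that [|] then has to consume, so the pattern can never match and the string is returned unsplit. Use: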

[|](?![|])

See the regex demo

Details

  • [|] - matches a | char
  • (?![|]) - a negative lookahead that requires no | char immediately to the right of the current location.
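To check the behaviour outside SQL Server, here is a minimal sketch that calls Regex.Split directly with the pattern above. The class name PipeSplitDemo and the helper SplitOnLastPipe are made up for this illustration and are not part of the CLR function in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, used only to illustrate the pattern.
public static class PipeSplitDemo
{
    // Splits on a pipe only when the next character is not another pipe,
    // so in a run of pipes only the last one acts as the separator.
    public static string[] SplitOnLastPipe(string input)
    {
        return Regex.Split(input, @"[|](?![|])");
    }

    public static void Main()
    {
        // The sample value from the question: the first pipe stays
        // attached to the left value, the second pipe is the separator.
        string[] parts = SplitOnLastPipe("慂||2215");

        for (int index = 0; index < parts.Length; index++)
        {
            Console.WriteLine("{0}: {1}", index, parts[index]);
        }
    }
}
```

For the sample value this returns two elements, 慂| and 2215. Passing [|](?![|]) as the sqlPattern argument of SplitWithOrder should behave the same way, since that function forwards the pattern to Regex.Split unchanged.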

