String.Split variation in C#

I want to use the Split function on a string but keep the delimiting sequence as the first characters of each element in the resulting string array. I am using this to split HTML on every instance of a URL so I can run regex patterns on the URLs of a website. Are there any overloads of Split that do this, or do I have to write my own function?

Thanks!

There is no built-in method for doing that. If you are splitting on a single delimiter, though, it can be written as follows:

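using System;
using System.Collections.Generic;
using System.Linq;

// Extension methods must be static and live in a static class.
public static class StringSplitExtensions {
  public static IEnumerable<string> SplitAndKeepPrefix(this string source, string delimiter) {
    return SplitAndKeepPrefix(source, delimiter, StringSplitOptions.None);
  }

  // Splits on the delimiter, then re-attaches it to every piece except the first.
  public static IEnumerable<string> SplitAndKeepPrefix(this string source, string delimiter, StringSplitOptions options) {
    var split = source.Split(new[] { delimiter }, options);
    return split.Take(1).Concat(split.Skip(1).Select(x => delimiter + x));
  }
}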

IEnumerable<string> result = htmlStr.SplitAndKeepPrefix("<a");

EDIT

Updated to not prefix every string :)

    public static string[] SplitAndKeepDelimiters(this string Original, string[] Delimeters, StringSplitOptions Options)
    {
        var strings = EnumSplitAndKeepDelimiters(Original, Delimeters);

        if (Options == StringSplitOptions.RemoveEmptyEntries)
        {
            return strings.Where((s) => s.Length != 0).ToArray();
        }
        else
        {
            return strings.ToArray();
        }
    }

    static IEnumerable<string> EnumSplitAndKeepDelimiters(this string Original, string[] Delimeters)
    {
        int currIndex = 0;

        while (currIndex < Original.Length)
        {
            var delimiterIndex = Delimeters.Select((d) => new { Source = d, Index = Original.IndexOf(d, currIndex) })
                .Where((d) => (d.Index != -1) && (d.Source != string.Empty) )
                .OrderBy((d) => d.Index)
                .FirstOrDefault();
            int nextIndex = (delimiterIndex != null) ? delimiterIndex.Index + delimiterIndex.Source.Length : Original.Length;
            yield return Original.Substring(currIndex, nextIndex - currIndex);
            currIndex = nextIndex;
        }
    }
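Note that the code above leaves each delimiter at the end of the piece it terminates. The following is a minimal sketch of a variant that puts the delimiter at the start of each piece instead, as the question asks. The class name `PrefixSplitSketch` and the method name `SplitKeepingLeadingDelimiters` are made up for illustration, and ordinal (culture-insensitive) matching is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixSplitSketch
{
    // Hypothetical variant: every piece after the first begins with the
    // delimiter that introduced it.
    public static IEnumerable<string> SplitKeepingLeadingDelimiters(string original, string[] delimiters)
    {
        int start = 0;
        while (start < original.Length)
        {
            // Search from start + 1 so the delimiter that opens the
            // current piece is not matched again.
            int next = -1;
            foreach (string d in delimiters)
            {
                if (string.IsNullOrEmpty(d)) continue; // skip unusable delimiters
                int i = original.IndexOf(d, start + 1, StringComparison.Ordinal);
                if (i != -1 && (next == -1 || i < next)) next = i;
            }

            // No further delimiter: the rest of the string is the last piece.
            if (next == -1) next = original.Length;

            yield return original.Substring(start, next - start);
            start = next;
        }
    }
}
```

With the input `"x<a1<a2"` and the single delimiter `"<a"`, this yields `"x"`, `"<a1"`, and `"<a2"`. An input that itself starts with a delimiter produces no empty leading piece.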

As far as I know, this is not possible with the default Split method. You could write an extension method to solve your problem, or simply iterate through the string[] and place the delimiter in front of each string.

I would go for the extension method :)
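The plain-loop approach could look roughly like the sketch below. The names `SplitLoopExample` and `SplitKeepingPrefix` are hypothetical, and the sketch assumes the first element should not get the prefix, since it holds the text before the first delimiter.

```csharp
using System;

public static class SplitLoopExample
{
    // Hypothetical helper: splits on one delimiter, then puts it back
    // in front of every piece except the first.
    public static string[] SplitKeepingPrefix(string source, string delimiter)
    {
        string[] parts = source.Split(new[] { delimiter }, StringSplitOptions.None);

        // parts[0] is the text before the first delimiter, so it gets no prefix.
        for (int i = 1; i < parts.Length; i++)
        {
            parts[i] = delimiter + parts[i];
        }
        return parts;
    }
}
```

For example, `SplitLoopExample.SplitKeepingPrefix("x<a href=1><a href=2>", "<a")` returns `"x"`, `"<a href=1>"`, and `"<a href=2>"`.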

The answer is no; you'll have to roll your own version.

Information on the String.Split API can be found on MSDN.
