
Lowercase the second match in a combination of words using Regex.Replace

When formatting a person's last name (I know this is a terrible job), I want to lowercase the second match in a combination of any of the following words: Van, Den, Der, In, de, het. The pattern should repeat if it occurs again after a '-' (combined family names).

Wanted results:
Van Den Broek => Van den Broek
Derksen-van 't schip => Derksen-Van 't Schip
In Het Lid-Van De Boer => In het Lid-Van de Boer

I've tried capitalizing the first letter of each word and lowercasing the letter after ' using the code below. However, producing the results above with Regex is still a bridge too far for me.

var formattedLastName = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(lastName); 
formattedLastName = Regex.Replace(formattedLastName, @"('\w\b)", (Match match) => match.ToString().ToLower());

You can achieve your expected output using:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Globalization;

public class Test
{
    public static void Main()
    {
        var strings = new List<string> { "Van Den Broek", "Derksen-van 't schip", "In Het Lid-Van De Boer" };
        var textInfo = new CultureInfo("en-US", false).TextInfo;
        // Group 1: one of the listed words; Group 2 (optional): the word that follows it.
        var pattern = new Regex(@"\b(Van|Den|Der|In|de|het)\b(?:\s+(\w+))?", RegexOptions.Compiled | RegexOptions.IgnoreCase);
        foreach (var s in strings)
            // Title-case Group 1 and, if Group 2 matched, append it in lowercase.
            Console.WriteLine(pattern.Replace(s, m => textInfo.ToTitleCase(m.Groups[1].Value) +
               (m.Groups[2].Success ? $" {m.Groups[2].Value.ToLower()}" : "")));
    }
}

See the online demo, which yields:

Van den Broek
Derksen-Van 't schip
In het Lid-Van de Boer

The \b(Van|Den|Der|In|de|het)\b(?:\s+(\w+))? regex matches a whole word from the list Van, Den, Der, In, de, het (case-insensitively, because of RegexOptions.IgnoreCase) and captures it into Group 1. It then optionally matches one or more whitespace characters followed by any word, which is captured into Group 2.
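To see what the two groups capture, the matches can be listed without replacing anything. This is only a small sketch: the class name GroupInspector and the helper Describe are hypothetical and not part of the answer; the pattern is the one quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupInspector
{
    // Same pattern as in the answer.
    private static readonly Regex Pattern = new Regex(
        @"\b(Van|Den|Der|In|de|het)\b(?:\s+(\w+))?",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns one "Group1|Group2" entry per match.
    // Group 2 is the empty string when the optional part did not match.
    public static List<string> Describe(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(input))
            result.Add(m.Groups[1].Value + "|" + m.Groups[2].Value);
        return result;
    }

    public static void Main()
    {
        foreach (var entry in Describe("In Het Lid-Van De Boer"))
            Console.WriteLine(entry);
        // In|Het
        // Van|De
    }
}
```

For "Derksen-van 't schip" the only match is van with an empty Group 2, because \w+ cannot match the apostrophe in 't. "Der" inside "Derksen" is not matched, since \b requires a word boundary after it.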

The match is replaced with Group 1 converted to title case (note the use of System.Globalization.TextInfo.ToTitleCase) and, if Group 2 matched, a single space followed by the Group 2 value converted to lowercase.
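Two details of the code above may matter in practice. It is applied to the raw input, so the demo prints "Derksen-Van 't schip" where the question asked for "Derksen-Van 't Schip". Also, Group 2 accepts any word, so an input such as "De Boer" would become "De boer". A possible variation, sketched below, first applies the two lines from the question and then lowercases the second word only when it is itself one of the listed words. The names LastNameFormatter and Format are hypothetical, and the use of InvariantCulture and of an initial ToLowerInvariant call are assumptions of this sketch, not something the answer prescribes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LastNameFormatter
{
    // Assumption: invariant culture; the answer itself uses en-US.
    private static readonly TextInfo Text = CultureInfo.InvariantCulture.TextInfo;

    // Group 1: a listed word; Group 2 (optional): another listed word right after it.
    private static readonly Regex Particles = new Regex(
        @"\b(van|den|der|in|de|het)\b(?:\s+(van|den|der|in|de|het)\b)?",
        RegexOptions.IgnoreCase);

    // Hypothetical helper combining the question's steps with a stricter pattern.
    public static string Format(string lastName)
    {
        // Steps from the question: title-case every word, then lowercase
        // a single letter that follows an apostrophe (as in 't).
        var result = Text.ToTitleCase(lastName.ToLowerInvariant());
        result = Regex.Replace(result, @"('\w\b)", m => m.Value.ToLowerInvariant());

        // Keep the first listed word title-cased, lowercase a listed word that follows it.
        return Particles.Replace(result, m =>
        {
            var first = Text.ToTitleCase(m.Groups[1].Value.ToLowerInvariant());
            return m.Groups[2].Success
                ? first + " " + m.Groups[2].Value.ToLowerInvariant()
                : first;
        });
    }

    public static void Main()
    {
        Console.WriteLine(Format("Van Den Broek"));          // Van den Broek
        Console.WriteLine(Format("Derksen-van 't schip"));   // Derksen-Van 't Schip
        Console.WriteLine(Format("In Het Lid-Van De Boer")); // In het Lid-Van de Boer
    }
}
```

Like the original, this collapses the whitespace between two listed words into a single space. Real Dutch surnames have more prefixes than the six listed in the question, so the word list would need extending for production use.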
