How to split a string into 2 strings using RegEx?

I need a regex to extract the street name and the street number from a string. Assume the street name runs from the beginning of the string up to the first whitespace that is followed by a number.

Example:

Original string: 'Jan van Rijswijcklaan 123'. The result should be 'Jan van Rijswijcklaan' as the street name and '123' as the street number.

Any help is appreciated.

UPDATE

I was able to get the street name and number, but sometimes the street number looks like '123b a1', and then the code fails to determine it: the result for the street number is only 'a1' instead of '123b a1'.

So at the moment I'm dealing with 2 scenarios:

  1. When streetname contains only alphabetic characters and number contains only digits - like 'Jan van Rijswijcklaan 123'
  2. When streetname contains only alphabetic characters and number contains alphanumeric characters - like 'Jan van Rijswijcklaan 123b a1'

Here is the code I tried:

string street = Regex.Match(streetWithNum, @"^[^0-9]*").Value + ";";
string number = Regex.Match(streetWithNum, @"\w\d*\w?\d*$").Value + ";";

Use a positive lookahead to define the split condition:

var s = "Jan van Rijswijcklaan 124";
var result = Regex.Split(s, @"\s(?=\d)");

Console.WriteLine("street name: {0}", result[0]);
Console.WriteLine("street number: {0}", result[1]);

prints:

street name: Jan van Rijswijcklaan
street number: 124

Note: use Int32.TryParse to convert the street number from string to int if you need to.
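
One caveat: the pattern splits at every whitespace that precedes a digit, so an input such as 'Main Street 12 3' would produce three parts. Below is a minimal sketch that caps the split at two parts and shows the Int32.TryParse call from the note. It assumes C# 9 or later (top-level statements); the helper name SplitAddress is hypothetical and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (an assumption, not from the original answer): split at
// the first whitespace that is followed by a digit, giving at most two parts.
static (string Street, string Number) SplitAddress(string input)
{
    var splitter = new Regex(@"\s(?=\d)");
    string[] parts = splitter.Split(input, 2); // count = 2 caps the result at two substrings
    return parts.Length == 2 ? (parts[0], parts[1]) : (input, "");
}

var (street, number) = SplitAddress("Jan van Rijswijcklaan 123b a1");
Console.WriteLine("street name: {0}", street);
Console.WriteLine("street number: {0}", number);

// TryParse succeeds only when the street number is purely numeric.
bool isNumeric = int.TryParse(number, out int parsed);
Console.WriteLine("numeric: {0}", isNumeric);
```

With a number such as '123b a1', TryParse returns false, so the street number probably has to stay a string in that scenario.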

I'm not a fan of regex, do you notice that?

IEnumerable<string> nameParts = "Jan van Rijswijcklaan 124".Split()
    .TakeWhile(word => !word.All(Char.IsDigit));
string name = string.Join(" ", nameParts);

If you want to take both, the street-name and the number:

string[] words = "Jan van Rijswijcklaan 124".Split();
var streetNamePart = words.TakeWhile(w => !w.All(Char.IsDigit));
var streetNumberPart = words.SkipWhile(w => !w.All(Char.IsDigit));
Console.WriteLine("street-name: {0}", string.Join(" ", streetNamePart));
Console.WriteLine("street-number: {0}", string.Join(" ", streetNumberPart));
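
Note that w.All(Char.IsDigit) only recognises purely numeric words. With a number like '123b a1', TakeWhile never stops and the whole string ends up in the street name. Here is a hedged variant, assuming (as the question does) that the number part starts at the first word beginning with a digit. It is written for C# 9 or later, and SplitByFirstDigitWord is a hypothetical name used only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper: everything before the first word that starts with a
// digit is the street name; that word and everything after it is the number.
static (string Street, string Number) SplitByFirstDigitWord(string input)
{
    string[] words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    Func<string, bool> isNamePart = w => !char.IsDigit(w[0]);
    string street = string.Join(" ", words.TakeWhile(isNamePart));
    string number = string.Join(" ", words.SkipWhile(isNamePart));
    return (street, number);
}

var address = SplitByFirstDigitWord("Jan van Rijswijcklaan 123b a1");
Console.WriteLine("street-name: {0}", address.Street);
Console.WriteLine("street-number: {0}", address.Number);
```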

Here is a non-regex solution as well:

string str = "Jan van Rijswijcklaan 124";
var array = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

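string streetname = "";
string streetnumber = "";
bool inNumber = false; // becomes true at the first word that starts with a digit
foreach (var item in array)
{
    if (inNumber || Char.IsNumber(item[0]))
    {
        // keep later words such as "a1" in the number, separated by spaces
        inNumber = true;
        streetnumber += item + " ";
    }
    else
        streetname += item + " ";
}

Console.WriteLine(streetname.TrimEnd());
Console.WriteLine(streetnumber.TrimEnd());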

The output will be:

Jan van Rijswijcklaan
124

A fix for @Ilya_Ivanov's answer using a lookahead:

var result = Regex.Split(s, @"\s(?=\d)");

This should do:

Regex r = new Regex(@"(.+?) (\d+)$");
Match m = r.Match("Jan van Rijswijcklaan 124");
String street = m.Groups[1].Value;
String number = m.Groups[2].Value;

I typed this from memory, don't blame me for typos :)

edit: the '$' at the end of the regex makes sure the number is matched only at the end of the input string.

edit 2: just removed the typos and tested the code, it works now.

edit 3: the expression can be read as: lazily collect characters into group 1 (.+?), taking only as many as needed, and leave a sequence of digits at the end of the string, preceded by a space, for group 2 (\d+)$
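
Because group 2 accepts digits only, this pattern does not match the '123b a1' case from the question's update. A possible adjustment is sketched below, assuming the number part begins at the first whitespace followed by a digit and runs to the end of the string. The group names street and number and the helper ParseAddress are illustrative and not from the original answer; the code targets C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns false when no number part is found.
static bool ParseAddress(string input, out string street, out string number)
{
    // (?<street>.+?)   lazily takes as little as possible for the name
    // \s+              the separating whitespace
    // (?<number>\d.*)  starts at a digit and runs to the end of the string
    Match m = Regex.Match(input, @"^(?<street>.+?)\s+(?<number>\d.*)$");
    street = m.Success ? m.Groups["street"].Value : input;
    number = m.Success ? m.Groups["number"].Value : "";
    return m.Success;
}

foreach (var sample in new[] { "Jan van Rijswijcklaan 123", "Jan van Rijswijcklaan 123b a1" })
{
    if (ParseAddress(sample, out string name, out string num))
        Console.WriteLine("{0} | {1}", name, num);
}
```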
