Regular expression to parse FTP link string

Question

I have the following code to parse the parts of an FTP link:

Regex exp = new Regex(@"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>\S+?.\S+?.\S+?.\S+?)");
Match m = exp.Match(@"Link: ftp://username:password@host.sub.domain.tld<ftp://username:password@host.sub.domain.tld/>");

Console.WriteLine("Host = " + m.Groups["host"].Value);
Console.WriteLine("User = " + m.Groups["user"].Value);
Console.WriteLine("Pass = " + m.Groups["passwd"].Value);

Which produces the following output:

Host = host.su
User = username
Pass = password

Why is the host being truncated?

Answer 1

.NET already parses URIs for you. URI syntax has too many edge cases and variations to handle reliably with a regex.

So use the built-in support:

var u = new Uri("ftp://username:password@host.sub.domain.tld");

var host = u.Host;
var ui = u.UserInfo.Split(':');
var user = ui[0];
var pwd = ui[1];
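
The snippet above assumes the string is a valid URI and that the user info always contains a password. As a minimal sketch of a more defensive variant (the class FtpUriDemo and the helper TryParseFtp are hypothetical names made up for this example, not part of .NET), something like the following could be used:

```csharp
using System;

public static class FtpUriDemo
{
    // Hypothetical helper: splits an FTP URI into host, user, and password.
    // Returns false when the text is not an absolute ftp:// URI.
    public static bool TryParseFtp(string text, out string host, out string user, out string password)
    {
        host = user = password = string.Empty;

        if (!Uri.TryCreate(text, UriKind.Absolute, out Uri uri) || uri.Scheme != Uri.UriSchemeFtp)
            return false;

        host = uri.Host;

        // Split on the first ':' only, so the password part is kept whole.
        string[] parts = uri.UserInfo.Split(new[] { ':' }, 2);
        user = parts[0];
        password = parts.Length > 1 ? parts[1] : string.Empty;
        return true;
    }

    public static void Main()
    {
        if (FtpUriDemo.TryParseFtp("ftp://username:password@host.sub.domain.tld",
                                   out string host, out string user, out string password))
        {
            Console.WriteLine("Host = " + host);
            Console.WriteLine("User = " + user);
            Console.WriteLine("Pass = " + password);
        }
    }
}
```

Note that this expects the bare URI. The surrounding text from the question ("Link: " and the trailing <...> part) would have to be stripped before calling it.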

Answer 2

Because \S also matches the dot character, and an unescaped . matches any character. Escape the dots and exclude them from the character classes:

@"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>[^.\s]+\.[^.\s]+\.[^.\s]+\.\w+)"


Why?

(?<host>\S+?.\S+?.\S+?.\S+?)

\S+? - Matches only the first character, because the quantifier is lazy (non-greedy).
. - Matches the second character, since an unescaped dot matches any character.
The remaining \S+? and . pairs behave the same way, so the whole group consumes only the first 7 characters of the host part: host.su
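
To check this behaviour, here is a small sketch that runs both patterns against the input from the question. The class FtpLinkDemo and the helper ExtractHost are hypothetical names introduced only for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class FtpLinkDemo
{
    // Pattern from the question: lazy \S+? with unescaped dots.
    public const string OriginalPattern =
        @"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>\S+?.\S+?.\S+?.\S+?)";

    // Pattern from this answer: escaped dots, and labels that cannot contain a dot.
    public const string FixedPattern =
        @"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>[^.\s]+\.[^.\s]+\.[^.\s]+\.\w+)";

    // Hypothetical helper: returns the "host" group, or an empty string when nothing matches.
    public static string ExtractHost(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups["host"].Value : string.Empty;
    }

    public static void Main()
    {
        string input = @"Link: ftp://username:password@host.sub.domain.tld<ftp://username:password@host.sub.domain.tld/>";

        Console.WriteLine("Original: " + ExtractHost(OriginalPattern, input)); // host.su
        Console.WriteLine("Fixed:    " + ExtractHost(FixedPattern, input));    // host.sub.domain.tld
    }
}
```

Keep in mind that the fixed pattern requires exactly three dots in the host, so a host such as example.com would not match it.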

2 answers

solution1
5 2015-06-15 08:04:29

solution2
1 ACCEPTED 2015-06-15 08:01:17
