I have the following code to parse the parts of an FTP link:
Regex exp = new Regex(@"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>\S+?.\S+?.\S+?.\S+?)");
Match m = exp.Match(@"Link: ftp://username:password@host.sub.domain.tld<ftp://username:password@host.sub.domain.tld/>");
Console.WriteLine("Host = " + m.Groups["host"].Value);
Console.WriteLine("User = " + m.Groups["user"].Value);
Console.WriteLine("Pass = " + m.Groups["passwd"].Value);
Which produces the following output:
Host = host.su
User = username
Pass = password
Why is the host being truncated?
URI parsing is already built into .NET. URI syntax has too many edge cases and variations to handle reliably with a regex.
So use the built-in support:
var u = new Uri("ftp://username:password@host.sub.domain.tld");
var host = u.Host;
var ui = u.UserInfo.Split(':');
var user = ui[0];
var pwd = ui[1];
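The snippet above assumes the link always carries both a user name and a password. A slightly more defensive sketch is shown here; the class name FtpUriParts and the method Parse are hypothetical, and the choice to return an empty string for a missing password is an assumption, not something the original answer specifies. It splits the user info at the first colon only and unescapes percent-encoded characters.

```csharp
using System;

public static class FtpUriParts
{
    // Hypothetical helper: returns host, user and password from an FTP link.
    public static (string Host, string User, string Password) Parse(string link)
    {
        var uri = new Uri(link);

        // Split at the first ':' only, so a ':' inside the password is preserved.
        string[] parts = uri.UserInfo.Split(new[] { ':' }, 2);

        // User info may be percent-encoded, so unescape each part.
        string user = Uri.UnescapeDataString(parts[0]);

        // Assumption: a link without a password yields an empty string.
        string password = parts.Length > 1 ? Uri.UnescapeDataString(parts[1]) : "";

        return (uri.Host, user, password);
    }

    public static void Main()
    {
        var r = Parse("ftp://username:password@host.sub.domain.tld");
        Console.WriteLine("Host = " + r.Host);     // Host = host.sub.domain.tld
        Console.WriteLine("User = " + r.User);     // User = username
        Console.WriteLine("Pass = " + r.Password); // Pass = password
    }
}
```

With this shape, a link such as ftp://anonymous@host.sub.domain.tld does not throw an IndexOutOfRangeException, which the ui[1] access would.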
Because \S also matches the dot character, and an unescaped . matches any character. Escaping the dots and excluding them from each host segment gives:
@"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>[^.\s]+\.[^.\s]+\.[^.\s]+\.\w+)"
Why?
(?<host>\S+?.\S+?.\S+?.\S+?)
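\S+? - Matches only the first character, because the quantifier is non-greedy.
. - Matches the second character, since an unescaped dot matches any character.
This pair repeats, and nothing follows the group to force the lazy quantifiers to expand, so the group captures just seven characters (h, o, s, t, ., s, u), which is host.su.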
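To see the difference side by side, here is a minimal sketch that runs both patterns against the input from the question. The class name FtpHostDemo and the helper Host are hypothetical; the two patterns and the input string are taken from the post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FtpHostDemo
{
    // Pattern from the question: lazy \S+? separated by unescaped dots.
    public const string Original =
        @"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>\S+?.\S+?.\S+?.\S+?)";

    // Pattern from the answer: escaped dots, and host segments that cannot contain a dot.
    public const string Fixed =
        @"(?i)ftp:\/\/(?<user>\S+?):(?<passwd>\S+?)@(?<host>[^.\s]+\.[^.\s]+\.[^.\s]+\.\w+)";

    // Returns the value of the "host" group for the first match (empty if no match).
    public static string Host(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups["host"].Value;
    }

    public static void Main()
    {
        string input = @"Link: ftp://username:password@host.sub.domain.tld<ftp://username:password@host.sub.domain.tld/>";
        Console.WriteLine("Original: " + Host(Original, input)); // Original: host.su
        Console.WriteLine("Fixed:    " + Host(Fixed, input));    // Fixed:    host.sub.domain.tld
    }
}
```

Note that the corrected pattern still hard-codes exactly four dot-separated labels, so a host such as ftp.example.com would not match it.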