Match part of URL pattern with or without trailing slash
I have to match a pattern against a URL. I want the pattern to match the domain, regardless of whether the URL ends in a trailing slash, has query string parameters, or includes a subdomain. I only want to accept the protocols http or https.
Here is what I tried:
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
using Newtonsoft.Json;

public class Program
{
    public static void Main()
    {
        List<string> inputs = new List<string>{
            "https://dotnetfiddle.net/UA6bCb"
            ,"http://www.test.ch/de-ch/apps/weve?anlassId=236601"
            ,"https://www.test.ch/de-ch/apps/weve?anlassId=236601"
            ,"http://test.ch/de-ch/apps/weve?anlassId=236601"
            ,"https://test.ch/de-ch/apps/weve?anlassId=236601"
            ,"https://test.chn/de-ch/apps/weve?anlassId=236601"
            ,"https://www.test.chn/de-ch/apps/weve?anlassId=236601"
            ,"https://test.ch/de-ch/"
            ,"https://test.ch/de-ch"
            ,"https://test.ch/"
            ,"https://test.ch"
            ,"https:test.ch"
        };
        Test(inputs);
    }

    public static void Test(List<string> inputs)
    {
        var regexString= @"http(s)?://?([\w-]+\.)?test.ch(/[\w- ;,./?%&=]*)?";
        foreach(var input in inputs){
            var matches = Regex.Match(input,regexString, RegexOptions.Compiled | RegexOptions.IgnoreCase);
            if(matches.Success){
                Console.WriteLine("{0} matches {1}", input, regexString);
            }
            else{
                Console.WriteLine("NO MATCH for {0}", input);
            }
        }
    }
}
This returns:
NO MATCH: https://dotnetfiddle.net/UA6bCb
Match: http://www.test.ch/de-ch/apps/weve?anlassId=236601
Match: https://www.test.ch/de-ch/apps/weve?anlassId=236601
Match: http://test.ch/de-ch/apps/weve?anlassId=236601
Match: https://test.ch/de-ch/apps/weve?anlassId=236601
Match: https://test.chn/de-ch/apps/weve?anlassId=236601
Match: https://www.test.chn/de-ch/apps/weve?anlassId=236601
Match: https://test.ch/de-ch/
Match: https://test.ch/de-ch
Match: https://test.ch/
Match: https://test.ch
NO MATCH: https:test.ch
The problem is that this solution matches https://test.chn/de-ch/apps/weve?anlassId=236601 and https://www.test.chn/de-ch/apps/weve?anlassId=236601.
These should not match, because the domain ends in chn.
I haven't been able to get the regex right.
Thanks for the help.
If you just want to exclude test.chn, you can use a negative lookahead to ensure that ch is not followed by n:
"http(s)?://?([\w-]+\.)?test.ch(?!n)(/[\w- ;,./?%&=]*)?"
I added the (?!n) part.
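To see what the lookahead changes, here is a minimal sketch that runs the pattern above against a few of the question's URLs. The class name UrlPatternDemo, the IsMatch helper, the test.chx input, and StrictPattern are all my own additions for illustration, not part of the original answer. StrictPattern is only one possible tightening, on the assumption that you also want to reject other suffixes such as test.chx and want the dot in test.ch to be a literal dot.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlPatternDemo
{
    // Pattern from the answer: (?!n) is a negative lookahead that fails
    // the match when "test.ch" is immediately followed by "n".
    public const string LookaheadPattern =
        @"http(s)?://?([\w-]+\.)?test.ch(?!n)(/[\w- ;,./?%&=]*)?";

    // Hypothetical stricter variant (an assumption, not from the answer):
    // anchored at the start, requires "://", escapes the dot, and rejects
    // any word character, dot, or hyphen right after "test.ch".
    public const string StrictPattern =
        @"^https?://([\w-]+\.)*test\.ch(?![\w.-])";

    // Small helper so both patterns are checked the same way.
    public static bool IsMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string[] inputs =
        {
            "https://test.ch",
            "https://www.test.ch/de-ch/apps/weve?anlassId=236601",
            "https://test.chn/de-ch/apps/weve?anlassId=236601",
            "https://test.chx/de-ch",   // made-up input, not in the question
            "https:test.ch"
        };

        foreach (var input in inputs)
        {
            Console.WriteLine("{0}: lookahead={1}, strict={2}",
                input,
                IsMatch(LookaheadPattern, input),
                IsMatch(StrictPattern, input));
        }
    }
}
```

With (?!n) alone, test.chn is rejected but something like test.chx would still get through, since only the letter n is excluded. If that matters for your case, a broader lookahead along the lines of the strict variant may be a better fit.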