Match part of a URL pattern, with or without a trailing slash

Question

I need to match URLs against a pattern. The pattern should match the domain regardless of whether the URL ends in a trailing slash, has query-string parameters, or includes a subdomain. Only the http and https protocols should be accepted.

Here is what I tried:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        var inputs = new List<string>
        {
            "https://dotnetfiddle.net/UA6bCb",
            "http://www.test.ch/de-ch/apps/weve?anlassId=236601",
            "https://www.test.ch/de-ch/apps/weve?anlassId=236601",
            "http://test.ch/de-ch/apps/weve?anlassId=236601",
            "https://test.ch/de-ch/apps/weve?anlassId=236601",
            "https://test.chn/de-ch/apps/weve?anlassId=236601",
            "https://www.test.chn/de-ch/apps/weve?anlassId=236601",
            "https://test.ch/de-ch/",
            "https://test.ch/de-ch",
            "https://test.ch/",
            "https://test.ch",
            "https:test.ch"
        };

        Test(inputs);
    }

    public static void Test(List<string> inputs)
    {
        // Pattern under test: optional "s", optional single subdomain, optional path/query.
        var regexString = @"http(s)?://?([\w-]+\.)?test.ch(/[\w- ;,./?%&=]*)?";

        foreach (var input in inputs)
        {
            // Regex.Match succeeds if the pattern is found anywhere in the input.
            var match = Regex.Match(input, regexString, RegexOptions.Compiled | RegexOptions.IgnoreCase);

            if (match.Success)
            {
                Console.WriteLine("Match: {0}", input);
            }
            else
            {
                Console.WriteLine("NO MATCH: {0}", input);
            }
        }
    }
}

This returns:

NO MATCH: https://dotnetfiddle.net/UA6bCb
Match: http://www.test.ch/de-ch/apps/weve?anlassId=236601
Match: https://www.test.ch/de-ch/apps/weve?anlassId=236601
Match: http://test.ch/de-ch/apps/weve?anlassId=236601
Match: https://test.ch/de-ch/apps/weve?anlassId=236601
Match: https://test.chn/de-ch/apps/weve?anlassId=236601
Match: https://www.test.chn/de-ch/apps/weve?anlassId=236601
Match: https://test.ch/de-ch/
Match: https://test.ch/de-ch
Match: https://test.ch/
Match: https://test.ch
NO MATCH: https:test.ch

The problem is that this solution matches https://test.chn/de-ch/apps/weve?anlassId=236601 and https://www.test.chn/de-ch/apps/weve?anlassId=236601.

These should not match, because the domain ends in chn rather than ch.

I haven't been able to get the regex right.

Thanks for the help.

Answer 1

If you just want to exclude test.chn, you can use a negative lookahead to ensure that ch is not followed by n:

@"http(s)?://?([\w-]+\.)?test.ch(?!n)(/[\w- ;,./?%&=]*)?"

The only change is the added (?!n) part.
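
Note that (?!n) rejects only a host that continues with the letter n. If the goal is to reject any host that merely starts with test.ch, a broader lookahead may help. The following is a minimal sketch, not a definitive implementation. It assumes that the URL must start with the scheme followed by "://", that any number of subdomain labels is acceptable, and that the host ends when the next character is not a word character, dot, or hyphen. The names TestChUrl and IsMatch are hypothetical, and https://test.ch.example.com/ is an illustrative input that does not appear in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are not from the original post.
public static class TestChUrl
{
    // ^https?://        scheme, anchored at the start of the input (assumption)
    // ([\w-]+\.)*       zero or more subdomain labels
    // test\.ch          the domain, with the dot escaped so it matches only "."
    // (?![\w.-])        negative lookahead: the host must end here
    private static readonly Regex Pattern = new Regex(
        @"^https?://([\w-]+\.)*test\.ch(?![\w.-])",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool IsMatch(string url)
    {
        return Pattern.IsMatch(url);
    }
}

public class Program
{
    public static void Main()
    {
        string[] inputs =
        {
            "https://www.test.ch/de-ch/apps/weve?anlassId=236601",
            "https://test.ch",
            "https://test.chn/de-ch/apps/weve?anlassId=236601",
            "https://test.ch.example.com/", // illustrative input, not from the question
            "https:test.ch"
        };

        foreach (var input in inputs)
        {
            Console.WriteLine("{0}: {1}", TestChUrl.IsMatch(input) ? "Match" : "NO MATCH", input);
        }
    }
}
```

With this pattern the first two inputs match and the last three do not. Because the lookahead only inspects the character after test.ch, the path and query string no longer need to be described in the pattern at all.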

Answer 1    0    2021-06-17 18:54:57
