How to validate and fix invalid number of slashes in URL?

I have code that is not working as expected. The idea is to match the group of slashes in a URL. The number of slashes should be one or more. The algorithm should replace any number of slashes with exactly two. How can I fix the code?

HttpURLConverter

public class HttpURLConverter {

    final private String UrlPattern = "((([A-Za-z]{3,9}:(?:\\/\\/)?)(?:[\\-;:&=\\+\\$,\\w]+@)?[A-Za-z0-9\\.\\-]+|(?:www\\.|[\\-;:&=\\+\\$,\\w]+@)[A-Za-z0-9\\.\\-]+)((?:\\/[\\+~%\\/\\.\\w\\-_]*)?\\??(?:[\\-\\+=&;%@\\.\\w_]*)#?(?:[\\.\\!\\/\\\\\\w]*))?)";

    URL validateURL(URL url) throws MalformedURLException {
        URL validURL = null;
        if(!Pattern.matches(UrlPattern, url.toString())){
            if(Pattern.matches("(https?|ftp|file):.*", url.toString())){
                Matcher matcher = Pattern.compile("(https?|ftp|file):(\\/)*([A-za-z0-9\\.\\-?#_]+)([A-za-z0-9\\.\\-?#_\\/]{0,})", Pattern.CASE_INSENSITIVE).matcher(url.toString());

                List<String> allMatches = new ArrayList<String>();
                while (matcher.find()) {
                       allMatches.add(matcher.group());
                }
                if(allMatches.size() > 1){
                    System.out.println(allMatches.get(2));
                    allMatches.set(2, "//"); // replace any number of slashes with only two
                    validURL = new URL(allMatches.toString());

                }else{
                    throw new RuntimeException("Expected slashes after URL scheme definition but found none.");
                }
                System.out.println(matcher.group(1));
                System.out.println(matcher.group(2));
                System.out.println(matcher.group(3));
                System.out.println(matcher.group(4));

            }else{
                throw new RuntimeException("Given url is not valid. URL scheme is not detected");
            }
        }
        return validURL;
    }

}

TEST

@Test
    public void testHttpURLConverter2() throws MalformedURLException{
        assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:///google.com")));
    }
@Test
    public void testHttpURLConverter2() throws MalformedURLException{
        assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:/google.com")));
    }

In addition to @Dishi Jain's solution, take a closer look at your test cases. You compare an object of type String with an object of type URL (the return type of the method validateURL). So even once the method is implemented correctly, your test cases will always fail, because a String object is never equal to a URL object.

So do something like:

@Test
public void testHttpURLConverter2() throws MalformedURLException {
    // Compare the string forms, since validateURL returns a URL
    assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:/google.com")).toString());
}

or maybe:

@Test
public void testHttpURLConverter2() throws MalformedURLException {
    // Compare URL with URL instead of String with URL
    assertEquals(new URL("http://google.com"), new HttpURLConverter().validateURL(new URL("http:/google.com")));
}
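To make the type mismatch concrete, here is a minimal sketch outside of JUnit. The class `EqualsDemo` and its helper names are hypothetical and only serve the illustration; the point is that `String.equals(Object)` returns false for any argument that is not a String, which is what `assertEquals` ends up relying on.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class, not part of the original code.
class EqualsDemo {

    // Wraps the checked exception so the helper is easy to call.
    static URL url(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // String vs. URL: always false, whatever the URL contains.
    static boolean stringEqualsUrl(String expected, URL actual) {
        return expected.equals(actual);
    }

    // String vs. the URL's string form: a meaningful comparison.
    static boolean stringEqualsUrlText(String expected, URL actual) {
        return expected.equals(actual.toString());
    }
}
```

One design note: `URL.equals` may try to resolve host names when comparing two URLs, so comparing the string forms is usually the more predictable choice in a unit test.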

This is the best solution I could come up with. You will need to add further checks and handling to cover every case. This method prints the validated URL for both test inputs.

import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HttpURLConverter {

    final private String UrlPattern = "((([A-Za-z]{3,9}:(?:\\/\\/)?)(?:[\\-;:&=\\+\\$,\\w]+@)?[A-Za-z0-9\\.\\-]+|(?:www\\.|[\\-;:&=\\+\\$,\\w]+@)[A-Za-z0-9\\.\\-]+)((?:\\/[\\+~%\\/\\.\\w\\-_]*)?\\??(?:[\\-\\+=&;%@\\.\\w_]*)#?(?:[\\.\\!\\/\\\\\\w]*))?)";

    URL validateURL(URL url) throws MalformedURLException {
        System.out.println(url); // echo the input, as shown in the output below
        URL validURL = null;
        if (!Pattern.matches(UrlPattern, url.toString())) {
            if (Pattern.matches("(https?|ftp|file):.*", url.toString())) {
                Matcher matcher = Pattern
                        .compile("(https?|ftp|file):(\\/)*([A-Za-z0-9\\.\\-?#_]+)([A-Za-z0-9\\.\\-?#_\\/]{0,})", Pattern.CASE_INSENSITIVE)
                        .matcher(url.toString());

                List<String> allMatches = new ArrayList<String>();
                while (matcher.find()) {
                    allMatches.add(matcher.group());
                }

                for (String str : allMatches) {
                    // Collapse only the run of slashes right after the scheme;
                    // slashes in the path are left untouched.
                    str = str.replaceFirst(":/+", "://");
                    validURL = new URL(str);
                    System.out.println("Validated URL : " + validURL);
                }

            } else {
                throw new RuntimeException("Given url is not valid. URL scheme is not detected");
            }
        }

        return validURL;
    }

    public static void main(String[] args) throws MalformedURLException {
        new HttpURLConverter().validateURL(new URL("http:////google.com"));
    }

}

You get the following output:

http:////google.com
Validated URL : http://google.com
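As a possible alternative, the slash fix can be done on the plain string before any `URL` object is created. The sketch below is an assumption-laden illustration, not the code from either answer: the class `SlashNormalizer` and the method `normalizeSlashes` are hypothetical names, and the `file` scheme is deliberately left out because `file:///path` legitimately carries three slashes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class SlashNormalizer {

    // Group 1: scheme, group 2: one or more slashes, group 3: the rest.
    private static final Pattern SCHEME_SLASHES =
            Pattern.compile("^(https?|ftp):(/+)(.*)$", Pattern.CASE_INSENSITIVE);

    // Replaces the run of slashes after the scheme with exactly two.
    // Throws if the scheme is unknown or no slash follows it.
    static String normalizeSlashes(String url) {
        Matcher m = SCHEME_SLASHES.matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "Expected a known scheme followed by at least one slash: " + url);
        }
        return m.group(1) + "://" + m.group(3);
    }
}
```

Because only the slashes directly after the scheme are captured in group 2, slashes later in the path are passed through unchanged. The result can then be handed to `new URL(...)` if a URL object is needed.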
