
Java URI regexp takes too long

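I have a servlet filter in my Java application that makes sure users are using the most recent URI for articles and categories. The problem is that, according to the profiler results, this filter (by itself) takes about 40% of the total request time, even for the simple URI "/". What happens behind that URI is not simple either: it is a dynamic web page with a huge menu, article rankings, and so on.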

public class NameFilter implements Filter {

    private ArticleServiceIface articleService;
    private CategoryServiceIface categoryService;
    private UrlRewriteServiceIface urlRewriteService;
    private Pattern pattern = Pattern.compile("^(?>.*?)/(article|category)/(\\d+)/(?>.*)$");

    public void init(FilterConfig filterConfig) throws ServletException {
        ApplicationContext ctx = WebApplicationContextUtils.getRequiredWebApplicationContext(filterConfig.getServletContext());
        articleService = (ArticleServiceIface) ctx.getBean("articleService");
        categoryService = (CategoryServiceIface) ctx.getBean("categoryService");
        urlRewriteService = (UrlRewriteServiceIface) ctx.getBean("urlRewriteService");
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        String uri = ((HttpServletRequest) request).getRequestURI();
        Matcher matcher = pattern.matcher(uri);
        String currUri;
        if (matcher.matches()) {
            if (matcher.group(1).equals("article")) {
                Long articleId = Long.valueOf(matcher.group(2));

                ArticleDTO a = articleService.getById(articleId);
                currUri = urlRewriteService.getUrl(a.getId());
            } else {
                Long categoryId = Long.valueOf(matcher.group(2));

                CategoryDTO c = categoryService.getById(categoryId);
                currUri = urlRewriteService.getCategoryUrl(c.getId());
            }
        } else { // matches neither article nor category
            chain.doFilter(request, response);
            return;
        }
        if (currUri.equals(uri)) {
            chain.doFilter(request, response);
        } else {
            HttpServletResponse res = (HttpServletResponse) response;
            res.setStatus(HttpServletResponse.SC_MOVED_PERMANENTLY);
            res.setHeader("Location", currUri);
            res.getWriter().close();
        }


    }

    public void destroy() {
    }
}

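I have spent several hours debugging and profiling this and tried many different ways of writing the regex, but the result is always the same.

The bottleneck seems to be in the matches() method, which is called recursively, and for some reason it sometimes invokes the pattern matching over and over (thousands of times).

Thanks for any suggestions.

EDIT: profiler results (they look strange to me... according to the debugger, this should be the parsing of URI == "/")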


EDIT2: current regex:

 private static Pattern pattern = Pattern.compile(".*?/(article|category)/(\\d+)/.*");

The results are still the same. I will try measuring it with

  System.out.print(System.currentTimeMillis() - time);

EDIT3: Conclusion: it is probably a NetBeans profiler bug...

I tried this code with the URI "/":

    long time = System.currentTimeMillis();
    if (matcher.matches()) {
        if (matcher.group(1).equals("article")) {
            Long articleId = Long.valueOf(matcher.group(2));

            ArticleDTO a = articleService.getById(articleId);
            currUri = urlRewriteService.getUrl(a.getId());
        } else {
            Long categoryId = Long.valueOf(matcher.group(2));

            CategoryDTO c = categoryService.getById(categoryId);
            currUri = urlRewriteService.getCategoryUrl(c.getId());
        }
    } else { // matches neither article nor category
        System.out.println(System.currentTimeMillis() - time);
        ....

The output is always 0. So it seems to me that, for some reason, the NetBeans profiler adds time to this method.
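Since System.currentTimeMillis() only has millisecond resolution, an output of 0 for a single call shows that the match took less than a millisecond, but not how long it actually took. A rough sketch of a finer measurement follows, averaging many matches() calls with System.nanoTime(). The class name RegexTiming, the helper averageNanos, and the iteration counts are assumptions made for illustration, and a hand-rolled loop like this is only indicative, not a rigorous benchmark.

```java
import java.util.regex.Pattern;

public class RegexTiming {
    // Same pattern as in EDIT2, compiled once and reused.
    private static final Pattern PATTERN =
            Pattern.compile(".*?/(article|category)/(\\d+)/.*");

    /** Returns the average time in nanoseconds of one matches() call on uri. */
    static double averageNanos(String uri, int iterations) {
        int matched = 0; // use the result so the loop body is not dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (PATTERN.matcher(uri).matches()) {
                matched++;
            }
        }
        long elapsed = System.nanoTime() - start;
        // The same input must either always match or never match.
        if (matched != 0 && matched != iterations) {
            throw new IllegalStateException("inconsistent match results");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        averageNanos("/", 100_000); // warm-up pass before measuring
        System.out.printf("\"/\": %.1f ns per match%n",
                averageNanos("/", 1_000_000));
        System.out.printf("article URI: %.1f ns per match%n",
                averageNanos("/en/article/123/articleName", 1_000_000));
    }
}
```

If the averages come out far below a millisecond, that would be consistent with the regex not being the real bottleneck.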

Anyway, thanks to everyone for the help and cooperation; I learned a few regex tricks.

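Actually, you don't need the (?>...) groups in your pattern (strictly speaking they are atomic groups, not lookbehinds). The following code works for me and is fast: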

long l = System.currentTimeMillis();
Pattern p = Pattern.compile("^.*?/(article|category)/(\\d+)/.*$");
Matcher m = p.matcher("/category/1012/Grafy");
System.out.println("Matches: " + m.matches());
System.out.println("Group1: " + m.group(1) + ", Group2: " + m.group(2));
System.out.println("Time taken: " + (System.currentTimeMillis()-l));

Output:

Matches: true
Group1: category, Group2: 1012
Time taken: 0
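One difference between the two patterns is worth checking. In java.util.regex, (?>X) is an independent (atomic) group, so the lazy (?>.*?) at the start of the original pattern settles on the empty string and cannot be extended by backtracking. The sketch below compares the original pattern with the plain one on three sample URIs taken from this page. The class name PatternComparison and its helper methods are assumptions made for illustration.

```java
import java.util.regex.Pattern;

public class PatternComparison {
    // Original pattern from the filter, with atomic groups.
    static final Pattern ATOMIC =
            Pattern.compile("^(?>.*?)/(article|category)/(\\d+)/(?>.*)$");
    // Simplified pattern without the atomic groups.
    static final Pattern PLAIN =
            Pattern.compile("^.*?/(article|category)/(\\d+)/.*$");

    static boolean matchesAtomic(String uri) {
        return ATOMIC.matcher(uri).matches();
    }

    static boolean matchesPlain(String uri) {
        return PLAIN.matcher(uri).matches();
    }

    public static void main(String[] args) {
        String[] uris = {
            "/",
            "/category/1012/Grafy",
            "/en/article/123/articleName"
        };
        for (String uri : uris) {
            System.out.println(uri
                    + " atomic=" + matchesAtomic(uri)
                    + " plain=" + matchesPlain(uri));
        }
    }
}
```

Both patterns reject "/" and accept "/category/1012/Grafy", but only the plain one accepts a URI with a leading segment such as "/en/article/123/articleName", so the two are not interchangeable if prefixed URIs must be handled.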

EDIT: Try using find() instead of matches(), like this:

long l = System.currentTimeMillis();
// find() searches for the pattern anywhere in the input, so no leading/trailing .* is needed
Pattern p = Pattern.compile("/(article|category)/(\\d+)/");
Matcher m = p.matcher("/en/article/123/articleName");
System.out.println("Matches: " + m.find());
System.out.println("Group1: " + m.group(1) + ", Group2: " + m.group(2));
System.out.println("Time taken: " + (System.currentTimeMillis()-l));
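As a sketch of how the find() approach might be folded back into the filter, the pattern can be compiled once in a static field and the extraction wrapped in a small helper. The class name UriParser and the method parse are hypothetical names introduced here for illustration; they are not part of the original filter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UriParser {
    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern PATTERN =
            Pattern.compile("/(article|category)/(\\d+)/");

    /**
     * Returns {type, id} for the first article/category segment in uri,
     * or null when the URI contains no such segment.
     */
    static String[] parse(String uri) {
        Matcher m = PATTERN.matcher(uri);
        if (!m.find()) {
            return null; // e.g. "/" or any unrelated page
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("/en/article/123/articleName");
        System.out.println(parts[0] + " " + parts[1]);
        System.out.println(parse("/") == null);
    }
}
```

Checking the result of find() before calling group() matters here: group() throws IllegalStateException when no match has been found.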
