
CPP + Regular Expression to Validate URL

I want to build a regular expression in C++ (MFC) that validates a URL.

The regular expression must satisfy the following conditions.

Valid URLs:

  1. http://cu-241.dell-tech.co.in/MyWebSite/ISAPIWEBSITE/Denypage.aspx/

  2. http://www.google.com

  3. http://www.google.co.in

Invalid URLs:

  1. http://cu-241.dell-tech.co.in/\\MyWebSite/\\ISAPIWEBSITE/\\Denypage.aspx/ = The regex must reject this URL because of the '\\' characters in "/\\MyWebSite/\\ISAPIWEBSITE/\\Denypage.aspx/".

  2. http://cu-241.dell-tech.co.in//////MyWebSite/ISAPIWEBSITE/Denypage.aspx/ = The regex must reject this URL because of the run of repeated "/" characters after the host name.

  3. http://news.google.co.in/%5Cnwshp?hl=en&tab=wn = The regex must reject URLs into which the percent-encoded characters %5C or %2F have been inserted.

How can we develop a generic regular expression that satisfies the conditions above? Please help us by providing a regular expression that handles these scenarios in C++ (MFC).

Have you tried the regular expression suggested in RFC 3986? If you can use GCC 4.9, you can use <regex> directly.

RFC 3986 (Appendix B) states that with

  ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

you get the following submatches:

  scheme    = $2
  authority = $4
  path      = $5
  query     = $7
  fragment  = $9

For example:

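#include <cstdlib>
#include <iostream>
#include <regex>
#include <string>

int main(int argc, char *argv[])
{
  // Guard against a missing argument before touching argv[1].
  if (argc < 2) {
    std::cerr << "Usage: " << argv[0] << " <url>" << std::endl;
    return EXIT_FAILURE;
  }

  std::string url (argv[1]);
  unsigned counter = 0;

  // RFC 3986, Appendix B: splits a URI reference into its components.
  // '/' needs no escaping inside a bracket expression.
  std::regex url_regex (
    R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)",
    std::regex::extended
  );
  std::smatch url_match_result;

  std::cout << "Checking: " << url << std::endl;

  if (std::regex_match(url, url_match_result, url_regex)) {
    // Index 0 is the whole match; 1 to 9 are the capture groups.
    for (const auto& res : url_match_result) {
      std::cout << counter++ << ": " << res << std::endl;
    }
  } else {
    std::cerr << "Malformed url." << std::endl;
    return EXIT_FAILURE;
  }

  return EXIT_SUCCESS;
}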

Then:

./url-matcher http://localhost.com/path\?hue\=br\#cool

Checking: http://localhost.com/path?hue=br#cool
0: http://localhost.com/path?hue=br#cool
1: http:
2: http
3: //localhost.com
4: localhost.com
5: /path
6: ?hue=br
7: hue=br
8: #cool
9: cool
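
The RFC 3986 expression only splits a URL into parts; it accepts almost any string, so it does not reject the three invalid cases from the question by itself. A minimal sketch of how the split could be combined with explicit checks follows. The function name is_acceptable_url is hypothetical, and the restriction to http/https with a non-empty host is an assumption that the question does not state.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original answer): splits the URL with
// the RFC 3986 Appendix B regex, then applies the question's rejection rules.
bool is_acceptable_url(const std::string& url) {
    static const std::regex rfc3986(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    if (!std::regex_match(url, m, rfc3986)) return false;

    const std::string scheme = m[2].str();
    const std::string authority = m[4].str();
    const std::string path = m[5].str();

    // Assumption: only http/https URLs with a non-empty host are wanted.
    if (scheme != "http" && scheme != "https") return false;
    if (authority.empty()) return false;

    // Rule 1: no literal backslash anywhere in the URL.
    if (url.find('\\') != std::string::npos) return false;

    // Rule 2: no repeated slashes (empty segments) in the path.
    if (path.find("//") != std::string::npos) return false;

    // Rule 3: no percent-encoded backslash (%5C) or slash (%2F).
    static const std::regex encoded(R"(%(5[Cc]|2[Ff]))");
    if (std::regex_search(url, encoded)) return false;

    return true;
}
```

Calling is_acceptable_url on each URL from the question should accept the three valid ones and reject the three invalid ones. Keeping the rules as separate checks makes it easy to report which rule failed, which a single large pattern cannot do.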

Look at http://gskinner.com/RegExr/ ; there is a community tab on the right where you can find contributed regexes. There is a URI category. I'm not sure you'll find exactly what you need, but it is a good start.
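
Whatever pattern is taken as a starting point, it still has to encode the three rejection rules from the question. If a single expression is required, one possible sketch uses a negative lookahead, which needs the default ECMAScript grammar of std::regex rather than the extended one. The pattern and the name matches_strict_url are hypothetical, and limiting the scheme to http/https is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical single-pattern variant (an illustration, not a vetted URL
// validator). After the scheme, the lookahead rejects any later "//",
// backslash, %5C or %2F; the rest matches host, path, query and fragment.
bool matches_strict_url(const std::string& url) {
    static const std::regex strict(
        R"(^https?://(?!.*(//|\\|%5[Cc]|%2[Ff]))[^/?#\s]+(/[^?#\s]*)?(\?[^#\s]*)?(#\S*)?$)");
    return std::regex_match(url, strict);
}
```

One limitation of this sketch is that the lookahead scans the whole remainder of the URL, so a query string that legitimately contains "//" (for example an embedded redirect URL) is rejected as well.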
