Writing a C function using a regular expression that can validate a URL, IPv4 address, IPv6 address and FQDN
The C function below does a good job of validating any combination of URL and FQDN, but it fails to validate IPv4 addresses, the shorthand notation of IPv6, and certain other IPv6 address formats.
Can the regex below be improved to validate IPv4 and IPv6 addresses?
#include <regex.h>
#include <stdio.h>

int validateURLPhase2(char *url)
{
    int status;
    regex_t re;
    char *regexp = "^((ftp|http|https)://)?([a-z0-9]([-a-z0-9]*[a-z0-9])?\\.)|([0-9].[0-9].[0-9].[0-9])|(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))+((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)|(c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]|(g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)|(j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]|(m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)|(p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]|(t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw])$";

    /* Compile the pattern; REG_NOSUB because only match/no-match is needed. */
    if (regcomp(&re, regexp, REG_EXTENDED | REG_NOSUB | REG_ICASE) != 0)
    {
        printf("Regex has invalidated FQDN 1\n");
        return -1;
    }

    /* regexec returns 0 when the input matches the pattern. */
    status = regexec(&re, url, (size_t) 0, NULL, 0);
    regfree(&re);
    if (status != 0)
    {
        printf("Regex has invalidated FQDN 2\n");
        return -1;
    }
    return 0;
}
A valid URL that ideally should be accepted but fails:
http://[2001::1]/abc
Regex has invalidated FQDN 2
validation failed
An invalid URL that ideally should be rejected but succeeds:
http://10.192.1
validation success
Other cases that pass:
http://10.2.1.1/abc
http://www.example.com/abc
The part of your regexp that matches numeric addresses only allows a single digit in each component. It also doesn't escape the . characters, so each one matches any character rather than a literal dot. It should be:
([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3})
Note that this will allow invalid IPs like 123.456.789.0. It just checks that each number is 1-3 digits, not that it's between 0 and 255.
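If the 0 to 255 range matters, one possible approach is to spell out the allowed digit combinations for each octet, much as the IPv4-mapped IPv6 branch of the question's pattern already does. The following is only a minimal sketch under that assumption, not part of the original answer. The names OCTET and is_valid_ipv4 are hypothetical, and the pattern is anchored to match a bare address, so it would need adapting before being embedded in the larger URL pattern.

```c
#include <regex.h>
#include <stddef.h>

/* One octet in the range 0-255 (hypothetical helper macro).
 * Leading zeros such as "01" are rejected by this particular pattern. */
#define OCTET "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

/* Hypothetical helper: returns 1 if addr is a dotted-quad IPv4 address
 * with every octet in 0-255, 0 if it is not, and -1 if the pattern
 * fails to compile. */
int is_valid_ipv4(const char *addr)
{
    /* Four octets separated by literal (escaped) dots, fully anchored. */
    static const char *pattern =
        "^" OCTET "\\." OCTET "\\." OCTET "\\." OCTET "$";
    regex_t re;
    int status;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* regexec returns 0 when the input matches. */
    status = regexec(&re, addr, (size_t) 0, NULL, 0);
    regfree(&re);
    return status == 0 ? 1 : 0;
}
```

With this sketch, is_valid_ipv4("10.2.1.1") returns 1, while is_valid_ipv4("123.456.789.0") and is_valid_ipv4("10.192.1") both return 0.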