filter_var versus preg_match

Morning all

I'm converting a site that I'm working on to be compliant with the latest version of PHP, so I'm going through and replacing all instances of ereg with their non-deprecated equivalent. However, I was told about a handy built-in PHP function called filter_var.

My question is: would it make sense to go with filter_var over preg_match? Is there a performance boost or any other benefit to choosing one over the other, and if so, what is it?

First of all, the PHP Manual page on filtering: https://php.net/manual/en/book.filter.php

Second, context is key. Generally speaking, filter functions are designed to work on external input (scalars or arrays) or internal input. External input comes from sources like an HTTP request, the PHP engine, or a form submission.

Filter functions with the filter_input prefix allow you to bypass the $_SERVER, $_COOKIE, $_POST, and $_GET superglobals entirely. Although you generally specify "where" you want the data from, filter functions do not explicitly utilize $_POST, $_GET, $_COOKIE, and $_SERVER. Changes you make to the variable/array elements will not show up in $_GET, $_POST, or $_SERVER, so using filter this way is a paradigm shift and may change the flow of your application significantly. In other words, you have to track the external input yourself. I do this for initial sanitizing (stripping, replacing, altering, etc.) of external input. I no longer use $_POST, $_GET, or $_SERVER at all, although I do still use $_FILES.
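As a rough sketch of that flow, the snippet below reads a POST field through filter_input() instead of $_POST. The field name 'email' and the $status variable are made up for illustration. Note the three possible results: null when the field was not sent (which is also what you get when running outside a web request), false when validation fails, and the value itself otherwise.

```php
<?php
// Hypothetical field name; read straight from the request, not from $_POST.
$email = filter_input(INPUT_POST, 'email', FILTER_VALIDATE_EMAIL);

if ($email === null) {
    $status = 'missing';   // the field was not part of the request
} elseif ($email === false) {
    $status = 'invalid';   // the field was sent but failed validation
} else {
    $status = 'ok';        // $email holds the validated string
}

echo $status, PHP_EOL;
```

Because the value lives in your own variable from the start, any later cleanup you do stays in $email and never touches the superglobal.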

Functions prefixed with filter_var (filter_var and filter_var_array) are for filtering any variable or array that already exists within your program. I use these after having used filter_input. There are many filters you can use in both cases, but your question is about performance.
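For data that is already in a variable, something like the following could work. This is only a sketch: the $input array and its keys are invented, and the range limits are arbitrary.

```php
<?php
// Hypothetical data that already exists inside the program.
$input = ['age' => '42', 'email' => 'not-an-email', 'extra' => 'x'];

// One filter (optionally with options) per key we care about.
$spec = [
    'age'   => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 150],
    ],
    'email' => FILTER_VALIDATE_EMAIL,
];

$result = filter_var_array($input, $spec);

var_dump($result['age']);    // int(42): the string was validated and cast
var_dump($result['email']);  // bool(false): validation failed
// 'extra' is absent from $result because it is not listed in $spec.
```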

If you choose to use the FILTER_VALIDATE_REGEXP filter with any of the filtering functions, I cannot imagine this indirect approach being more efficient than directly using preg_match(). As far as the other filters go, if they are simply 'n' methods/functions removed from a regular expression call, I cannot see an improvement in efficiency there either.
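To make the comparison concrete, here is the same check written both ways. The pattern and the sample value are invented for illustration; the point is only that the filter version wraps the same regular expression in an options array.

```php
<?php
$pattern = '/^[a-z]{3}-\d{4}$/';   // hypothetical code format, e.g. "abc-1234"
$value   = 'abc-1234';

// Direct call: preg_match() returns 1 on a match, 0 otherwise.
$viaPreg = preg_match($pattern, $value) === 1;

// Indirect call: the filter returns the original string on a match, false otherwise.
$viaFilter = filter_var(
    $value,
    FILTER_VALIDATE_REGEXP,
    ['options' => ['regexp' => $pattern]]
);

var_dump($viaPreg);    // bool(true)
var_dump($viaFilter);  // string(8) "abc-1234"
```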

I see the filter functions as something that was designed to help improve consistency for filtering tasks that happen across many applications. They are probably not designed to be more efficient, but they are definitely designed to be more accessible than regular expressions (though I am very good with regular expressions). I prefer having direct knowledge of what's happening, but some people don't, or couldn't care less. However, the filter functions open the door to filtering strings for those who don't understand regular expressions and other basic web application security processes.

One can certainly live without using the filter functions, though.

What's more, I use the filter functions in conjunction with my own sanitizer and validator classes. So, I'm not asking PHP to think for me; I'm just using it to augment what I already know how to do (just in case its functions catch something I miss). Defense in depth.

In summary, your best bet is simply to use preg_match(), unless you intend to change how input flows into your application (the filter_input functions). Even then, there won't be a performance boost, but you can bypass $_SERVER, $_POST, and $_GET. Also, you can take advantage of simpler, structured, consistent filtering functionality, with the ability to use a callback function (FILTER_CALLBACK) to call custom, in-house methods/functions (which I do as well). Also, you can still use your own regular expressions with the filter functions using the FILTER_VALIDATE_REGEXP filter, but again, I see no reason to believe that the performance of your application will improve if you do. Maintainability? Maybe. It depends on the person writing the code.
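A minimal sketch of the FILTER_CALLBACK idea follows. The normalize_username() function and its rules are hypothetical stand-ins for whatever in-house validator you already have; the filter simply hands the value to the callback and returns whatever the callback returns.

```php
<?php
// Hypothetical in-house rule: 3 to 16 lowercase letters, digits, or underscores.
function normalize_username(string $value)
{
    $value = strtolower(trim($value));
    return preg_match('/^[a-z0-9_]{3,16}$/', $value) === 1 ? $value : false;
}

$good = filter_var('  Alice_01 ', FILTER_CALLBACK, ['options' => 'normalize_username']);
$bad  = filter_var('a!', FILTER_CALLBACK, ['options' => 'normalize_username']);

var_dump($good);  // string(8) "alice_01"
var_dump($bad);   // bool(false)
```

The regular expression is still doing the work here, so this buys consistency in how filters are invoked rather than speed.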

filter_var: Filters a variable with a specified filter
preg_match: Perform a regular expression match

I guess you could use filter_var to filter variables, but I don't think it is a good idea as a replacement for preg_match when upgrading from ereg: filter_var is not a general-purpose regex function, and you would have to rewrite a lot of the functionality/logic to do this.

Switching over to filter_var() would actually be a great idea. You wouldn't be reusing your existing regular expressions; however, you WOULD be able to eliminate them entirely in many cases. Often, the regexes we use in our apps are there only for simple validation and filtering, which is exactly what the filter_var() function is intended for.

For example, in your code, you may already have:

if (eregi('\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b', $_POST['email'])) {
    echo "valid";
}

This could be replaced by the prettier version (not relying on custom regular expressions):

if (filter_var($_POST['email'], FILTER_VALIDATE_EMAIL)) {
    echo "valid";
}

The filter_var() function can also sanitize out characters that aren't needed by the particular data you're examining, returning the cleaned string (instead of a boolean):

$clean = filter_var($_POST['email'], FILTER_SANITIZE_EMAIL);

This kind of usage of filter_var() would replace ereg_replace()-type functions.
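As an illustration with a made-up input string, sanitizing strips characters that are not allowed in an email address, and the cleaned value can then be validated separately:

```php
<?php
// Hypothetical raw input containing spaces and parentheses.
$raw = ' user(name)@example.com ';

// Sanitize first: disallowed characters are removed.
$clean = filter_var($raw, FILTER_SANITIZE_EMAIL);

// Validate second: returns the address on success, false on failure.
$valid = filter_var($clean, FILTER_VALIDATE_EMAIL);

var_dump($clean);  // string(20) "username@example.com"
var_dump($valid);  // string(20) "username@example.com"
```

Keep in mind that sanitizing can turn a malformed input into a different, valid-looking address, so whether to sanitize before validating is a design decision rather than a default.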

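However, for the simplest of upgrades, you can switch the ereg*() family of functions to their PCRE counterparts: ereg() becomes preg_match(), ereg_replace() becomes preg_replace(), and split() becomes preg_split(). The pattern then needs delimiters (for example /.../), and the case-insensitive variants (eregi(), eregi_replace()) are expressed with the i modifier. The preg_* functions are not deprecated in PHP 5.3+, whereas the ereg*() family was removed entirely in PHP 7.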
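A sketch of what that conversion might look like is below. The patterns and sample values are illustrative only, and the old calls appear just in comments since they no longer exist in current PHP.

```php
<?php
$email = 'user@example.com';   // sample value for illustration

// Old: eregi('^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$', $email)
// New: add delimiters and use the "i" modifier for case-insensitivity.
$isValid = preg_match('/^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$/i', $email) === 1;

// Old: ereg_replace('[^0-9]', '', $phone)
// New: same pattern, wrapped in delimiters.
$digits = preg_replace('/[^0-9]/', '', '(555) 010-9999');

var_dump($isValid);  // bool(true)
var_dump($digits);   // string(10) "5550109999"
```

If a pattern itself contains a forward slash, pick another delimiter such as # or escape the slash.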
