
Sanitize a string using a whitelist regex in PHP

I want to sanitize a $string using the following whitelist:

It includes a-z, A-Z, 0-9 and some common characters found in posts: []=+-¿?¡!<>$%^&*'"()/#@*,.:;_|
It also includes Spanish accented letters such as á, é, í, ó, ú and Á, É, Í, Ó, Ú.

WHITE LIST

abcdefghijklmnñopqrstuvwxyzñáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ0123456789[]=+-¿?¡!<>$%^&*'"()/#@*,.:;_|

I want to sanitize this string:

 $string="//abcdefghijklmnñopqrstuvwxyzñáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ0123456789[]=+-¿?¡!<>$%^&*'()/#@*,.:;_| |||||||||| ] ¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶¸¹º»¼½ mmmmm onload onclick='' [ ? / < ~ # ` ! @ $ % ^ & * ( ) + = } | :  ; ' , > { space !#$%&'()*+,-./:;<=>?@[\]^_`{|}~ <html>sdsd</html> ** *`` `` ´´ {} {}[] ````... ;;,,´'¡'!!!!¿?ña ñaña ÑA á é´´ è ´ 8i ó ú à à` à è`ì`ò ù &  > < ksks < wksdsd '' \" \' <script>alert('hi')</script>";

I tried this regex, but it doesn't work:

//$regex = '/[^\w\[\]\=\+\-\¿\?\¡\!\<\>\$\%\^\&\*\'\"\(\)\/\#\@\*\,\.\/\:\;\_\|]/i';
//preg_replace($regex, '', $string);

Does anyone have a clue how to sanitize this string according to the whitelist values?

If you know your whitelist characters, use the whitelist in the regex instead of enumerating a blacklist. The blacklist could be really big, especially if the encoding is something like UTF-8 or UTF-16.
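As a minimal sketch of the whitelist idea, a negated character class can remove everything that is not explicitly allowed. This swaps in preg_replace() for illustration and uses only a subset of the whitelist; the variable names and the sample input are assumptions. It also shows two details that such attempts often depend on: the "u" modifier, so that multi-byte characters like ñ or ¿ are handled as whole characters, and assigning the return value of preg_replace(), which does not modify its argument.

```php
<?php
// Assumption: this file and the input are both UTF-8 encoded.
$input = "Hola~ ¿qué tal? {ñandú} ©2024";

// [^...] matches any single character that is NOT in the whitelist.
// The "u" modifier makes PCRE read pattern and subject as UTF-8
// instead of as raw bytes.
$notAllowed = '/[^a-zA-Z0-9ñÑáéíóúÁÉÍÓÚ¿?¡! ]/u';

// preg_replace() returns the new string; $input itself is unchanged.
$clean = preg_replace($notAllowed, '', $input);

echo $clean, PHP_EOL; // Hola ¿qué tal? ñandú 2024
```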

There are many ways to do this. One is to create a regex that captures the desired range of characters (including spaces and newlines) and to compose a new string from the captured matches.

Also take care: some of the characters are reserved regex characters and need to be escaped, such as "[", "?" and "+".
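One way to avoid escaping each reserved character by hand is to let preg_quote() do it and build the character class from the whitelist string. The sketch below assumes UTF-8 input and that a plain space should be allowed; the function name sanitizeWithWhitelist() is hypothetical.

```php
<?php
// Hypothetical helper: removes every character that is not in the whitelist.
// Assumption: this file and the input are both UTF-8 encoded.
function sanitizeWithWhitelist(string $input): string
{
    // The whitelist from the question (duplicates dropped) plus a space.
    $whitelist = 'abcdefghijklmnñopqrstuvwxyzáéíóúABCDEFGHIJKLMNÑOPQRSTUVWXYZÁÉÍÓÚ'
        . '0123456789[]=+-¿?¡!<>$%^&*\'"()/#@,.:;_| ';

    // preg_quote() backslash-escapes regex metacharacters such as [ ] ^ - $;
    // the second argument also escapes the "/" delimiter.
    $class = preg_quote($whitelist, '/');

    // Negated class: anything not listed is removed. "u" enables UTF-8 mode.
    return preg_replace('/[^' . $class . ']/u', '', $input);
}

echo sanitizeWithWhitelist("a{b}~c`d"), PHP_EOL; // abcd
```

Keeping the whitelist as an ordinary string means it can be edited without touching the regex syntax around it.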

You could test a regex like:

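$string = "Your test string";
// Whitelist character class; the "u" modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '/[a-zA-Z0-9\[\]=+\-¿?¡!<>$%^&*\'"\sñÑáéíóúÁÉÍÓÚ]+/u';
preg_match_all($pattern, $string, $matches);
// $matches[0] holds every run of allowed characters; join them into the new string.
$newString = implode('', $matches[0]);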

This is only a simple example of how to apply the whitelist with a regex.
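The match-and-join approach could be wrapped in a small function, sketched here under a few assumptions: the name keepAllowed() is hypothetical, only a subset of the whitelist is used, and input that is not valid UTF-8 is rejected with an exception, since preg_match_all() returns false in that case when the "u" modifier is set.

```php
<?php
// Hypothetical helper; the name keepAllowed() is not from the original answer.
function keepAllowed(string $input): string
{
    // Subset of the whitelist, for brevity; "u" enables UTF-8 mode.
    $pattern = '/[a-zA-Z0-9ñÑáéíóúÁÉÍÓÚ¿?¡! ]+/u';

    // On failure (for example a subject that is not valid UTF-8)
    // preg_match_all() returns false instead of a match count.
    if (preg_match_all($pattern, $input, $matches) === false) {
        throw new InvalidArgumentException('Input could not be matched (invalid UTF-8?)');
    }

    // $matches[0] lists each run of allowed characters, in order.
    return implode('', $matches[0]);
}

echo keepAllowed("¡Hola! <script>alert('hi')</script>"), PHP_EOL;
// ¡Hola! scriptalerthiscript
```

Note that a character whitelist only drops disallowed characters; the text between them survives, so this is not a substitute for output escaping such as htmlspecialchars().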
