Remove a set of characters using Regex including the space character doesn't work
Currently, I am using a StringBuilder to remove a list of characters from a string, as shown below:
char[] charArray = {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
// Remove special characters that aren't allowed
var sanitizedAddress = new StringBuilder();
foreach (var character in emailAddress.ToCharArray())
{
if (Array.IndexOf(charArray, character) < 0)
sanitizedAddress.Append(character);
}
I tried to use Regex instead, as follows:
var invalidCharacters = Regex.Escape(@"%&=?{}|<>;:,\"()[]\\/*+\s");
emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");
You can use a character set [...] for this:
var invalidCharacters = "[" + Regex.Escape(@"%&=?{}|<>;:,""()\*/+") + @"\]\[\s]";
emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");
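To see how the character-set pattern above behaves, here is a minimal sketch that wraps it in two helper methods and prints what Regex.Escape produces for the two troublesome inputs. The class name, method names, and sample address are assumptions made for illustration; only Regex.Escape and Regex.Replace come from the original snippet.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterClassDemo
{
    // Builds the same character class as the snippet above:
    // escaped literals first, then ], [ and \s appended by hand.
    public static string BuildPattern() =>
        "[" + Regex.Escape(@"%&=?{}|<>;:,""()\*/+") + @"\]\[\s]";

    // Removes every character matched by the class (hypothetical helper).
    public static string Sanitize(string input) =>
        Regex.Replace(input, BuildPattern(), "");

    public static void Main()
    {
        // Regex.Escape doubles the backslash, so \s no longer means "whitespace".
        Console.WriteLine(Regex.Escape(@"\s"));   // prints \\s

        // Regex.Escape leaves a closing bracket untouched.
        Console.WriteLine(Regex.Escape("]"));     // prints ]

        // Space, brackets and plus sign are all removed.
        Console.WriteLine(Sanitize("john [doe]+test @example.com"));
        // prints johndoetest@example.com
    }
}
```

Keeping the hand-written part of the class (the brackets and \s) outside the Regex.Escape call is what lets those sequences keep their regex meaning.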
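Some notes:
In a verbatim string (@"..."), write "" rather than \" to embed a double quote.
\s is already an escape sequence, so Regex.Escape would turn it into \\s, which is not what you want.
Regex.Escape does not escape the ] character, which is why it is added separately.
Alternatively, you can try Linq instead of a regular expression, filtering out the unwanted characters with Where: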
using System.Collections.Generic;
using System.Linq;
...
// Hash set is faster on Contains operation than array - O(1) vs. O(N)
HashSet<char> toRemove = new HashSet<char>() {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
string emailAddress = ...
// Reassign the existing variable; declaring it a second time would not compile
emailAddress = string.Concat(emailAddress
.Where(c => !toRemove.Contains(c)));
You can add more conditions by chaining further Where calls:
emailAddress = string.Concat(emailAddress
.Where(c => !toRemove.Contains(c))
.Where(c => !char.IsWhiteSpace(c))); // get rid of white spaces as well
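Put together, the Linq approach might look like the following self-contained sketch. The class name, the Sanitize helper, and the sample input are assumptions for illustration; the set of characters is the one from the question, and the two Where conditions are merged into a single predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSanitizer
{
    // Characters to strip, taken from the question's list.
    private static readonly HashSet<char> ToRemove = new HashSet<char>
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' '
    };

    // Keeps only characters that are neither listed nor white space.
    public static string Sanitize(string input) =>
        string.Concat(input.Where(c => !ToRemove.Contains(c) && !char.IsWhiteSpace(c)));

    public static void Main()
    {
        // The space, the tab and the angle brackets are removed.
        Console.WriteLine(Sanitize("a b\t<c>@example.com")); // prints abc@example.com
    }
}
```

The char.IsWhiteSpace check covers tabs and line breaks, which the literal ' ' entry in the set alone would miss.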
If you insist on using regular expressions, you have to build the pattern, for example:
char[] charArray = {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
// Joined with | ("or" in regular expressions) all the characters (escaped!)
string pattern = string.Join("|", charArray
.Select(c => Regex.Escape(c.ToString())));
Then you can call Replace:
// emailAddress is assumed to be declared earlier; a declaration cannot read itself
emailAddress = Regex.Replace(emailAddress, pattern, "");
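As a runnable version of the alternation approach, the sketch below builds the pattern once and reuses a Regex instance. The class name, the Sanitize helper, and the sample input are hypothetical; caching the Regex in a static field is a design choice made here, not something the answer prescribes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationSanitizer
{
    // Characters to strip, taken from the question's list.
    private static readonly char[] CharArray =
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' '
    };

    // Each character is escaped on its own, then joined with | ("or").
    private static readonly Regex Invalid = new Regex(
        string.Join("|", CharArray.Select(c => Regex.Escape(c.ToString()))));

    // Removes every occurrence of any listed character.
    public static string Sanitize(string input) => Invalid.Replace(input, "");

    public static void Main()
    {
        // The backslash, the space and the parentheses are removed.
        Console.WriteLine(Sanitize(@"first\last (work)@example.com"));
        // prints firstlastwork@example.com
    }
}
```

Escaping each character separately avoids the problem of Regex.Escape mangling a hand-written sequence such as \s, because every element passed to it is meant to be a literal.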