
Regex matching extra characters

Original post 2017-12-05 18:56:48 3 1 regex

Using: this tool to evaluate my expression

My test string: "Little" Timmy (tim) McGraw

My regex: ^[()"]|.["()]

It looks like I'm properly catching the characters I want, but my matches also include whatever character comes just before each one. I'm not sure what, if anything, I'm doing wrong to be catching the preceding characters like that. The goal is to capture characters we don't want in the name field of one of our systems.

1 Solution

Brief

Your current regex ^[()"]|.["()] says the following:

^[()"]|.["()] Match either of the following
- ^[()"] Match the following
  - ^ Assert position at the start of the line
  - [()"] Match any character present in the list ()"
- .["()] Match the following
  - . Match any character (this is the issue you were having: the dot consumes the character just before the quote or parenthesis)
  - ["()] Match any character present in the list "()
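The breakdown above can be checked with a short script. This is only a sketch using Python's re module (the question does not say which regex engine or tool was used, so Python is an assumption here); it runs the original pattern against the test string from the question.

```python
import re

# Test string and pattern taken from the question.
test_string = '"Little" Timmy (tim) McGraw'
original_pattern = re.compile(r'^[()"]|.["()]')

# Each match from the second alternative is two characters long:
# the dot consumes whatever precedes the quote or parenthesis.
matches = original_pattern.findall(test_string)
print(matches)  # ['"', 'e"', ' (', 'm)']
```

Only the first match is a single character, because it comes from the ^[()"] alternative at the start of the line.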

Code

You can actually shorten your regex to just [()"].
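As a rough illustration of the shortened pattern, again assuming Python's re module rather than whatever engine the target system uses, the sketch below finds the unwanted characters and also strips them out. The variable names are made up for the example.

```python
import re

test_string = '"Little" Timmy (tim) McGraw'

# Shortened pattern: a single character class, no anchor and no dot.
unwanted = re.compile(r'[()"]')

found = unwanted.findall(test_string)
cleaned = unwanted.sub('', test_string)

print(found)    # ['"', '"', '(', ')']
print(cleaned)  # Little Timmy tim McGraw
```

Every match is now exactly one character, so nothing before the quote or parenthesis is captured.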

Ultimately, however, it would be much easier to create a negated set that defines which characters are valid rather than listing those that are invalid. This approach would get you something like [^\w ]. This means match anything not present in the set, so it matches any character that is neither a word character nor a space (in your sample string this will match the symbols ()" since they are not in the set).
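A minimal sketch of the negated-set approach, assuming Python's re module. The helper name find_invalid_chars is hypothetical and only for illustration; note that in Python 3, \w on str patterns also covers digits, the underscore, and non-ASCII letters.

```python
import re

# Negated set: anything that is not a word character or a space.
invalid_chars = re.compile(r'[^\w ]')

def find_invalid_chars(name):
    """Return every character in name that falls outside the allowed set."""
    return invalid_chars.findall(name)

sample = find_invalid_chars('"Little" Timmy (tim) McGraw')
clean = find_invalid_chars('Timmy McGraw')
hyphenated = find_invalid_chars('Anne-Marie')

print(sample)      # ['"', '"', '(', ')']
print(clean)       # []
print(hyphenated)  # ['-']
```

Because this is an allow-list, it also flags characters such as hyphens that the original pattern ignored, so the set may need extending (for example [^\w -]) if those are acceptable in the name field.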