Regex to match single words/character sets that aren't in quotes

I'm looking to write a regex (C#) that will match words that aren't surrounded by quotes. An example input string would be:

dbo.test line_length "quoted words" notquoted

And this needs to match:

dbo.test

line_length

notquoted

So there are 3 separate matches, and "quoted words" is not matched. The quoted phrase could be anywhere in the input: beginning, middle, end, etc.

I haven't been able to come up with a regex that matches words not in quotes when the quoted text may contain a space. The best I've managed handles something like hello "world" and returns only hello.

Is there a way to write the regex I'm trying to write?

There are two ways to tackle this, depending on what you want to do with the output.

First, match (but don't capture) any text within quotation marks. (This is specifically matching the stuff that you DON'T want.) Then, on the other side of the | pipe, use a capture group to select everything that you DO want to keep.

Example:

".*?"|(\b\S+\b)

You can see an example of that here.
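As a rough illustration of the first approach, here is a minimal sketch that uses Python's `re` module as a stand-in for C#; the pattern relies only on constructs that .NET's regex engine also supports. The helper name `unquoted_words` is hypothetical. The key point is that quoted spans still produce matches, but with an empty capture group, so the caller keeps only the matches where group 1 participated (in C#, that check would presumably be `match.Groups[1].Success`).

```python
import re

# Left alternative consumes a quoted span without capturing it;
# right alternative captures a bare word into group 1.
PATTERN = re.compile(r'".*?"|(\b\S+\b)')


def unquoted_words(text):
    """Return the words captured by group 1, skipping quoted spans."""
    return [
        m.group(1)
        for m in PATTERN.finditer(text)
        if m.group(1) is not None  # None means the quoted alternative matched
    ]


print(unquoted_words('dbo.test line_length "quoted words" notquoted'))
# -> ['dbo.test', 'line_length', 'notquoted']
```

Because the quoted alternative is tried first and consumes the whole span, the number of words inside the quotes does not matter.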

The other option uses look-arounds: look backward from the start of each word and forward from its end to ensure that a " doesn't appear on either side:

(?<!")(\b\S+\b)(?!")

You can see that here.
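The look-around variant can be tried the same way. This is again a sketch in Python's `re` rather than C#, with a hypothetical helper name `words_not_touching_quotes`. Here every match is a wanted word, so no filtering on the capture group is needed.

```python
import re

# A word is accepted only if no double quote sits directly before its
# first character or directly after its last character.
PATTERN = re.compile(r'(?<!")(\b\S+\b)(?!")')


def words_not_touching_quotes(text):
    """Return every word that is not directly adjacent to a double quote."""
    return [m.group(1) for m in PATTERN.finditer(text)]


print(words_not_touching_quotes('dbo.test line_length "quoted words" notquoted'))
# -> ['dbo.test', 'line_length', 'notquoted']
```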

This breaks down once a quoted phrase contains three or more words, because the middle words are not adjacent to either quote and will still match. Still, this should get you on the right track, and you can indicate whether one of these methods works better for you than the other.
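The multi-word caveat can be made concrete by running both patterns on a quoted phrase of three words. The sketch below is in Python's `re`, and the input string and function names are made up for illustration. It shows the look-around pattern leaking the middle word while the alternation pattern does not.

```python
import re

ALTERNATION = re.compile(r'".*?"|(\b\S+\b)')
LOOKAROUND = re.compile(r'(?<!")(\b\S+\b)(?!")')


def with_alternation(text):
    # Keep only matches where the capture group participated.
    return [m.group(1) for m in ALTERNATION.finditer(text) if m.group(1) is not None]


def with_lookaround(text):
    return [m.group(1) for m in LOOKAROUND.finditer(text)]


sample = 'say "one two three" done'
print(with_alternation(sample))  # -> ['say', 'done']
print(with_lookaround(sample))   # -> ['say', 'two', 'done']
```

If quoted phrases of arbitrary length are expected, the alternation pattern is the safer of the two.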
