
Regex, lookbehind/lookahead with ".*"

This word has to be taken together with the space before it.
A word like this has to be taken too.
If the word appears as \gloss{word}, \(anything here)sezione{word}, \gloss{anything word anything}, or \(anything here)sezione{anything word anything}, it must not be taken.
If the word appears as \(anything but gloss or sezione){ word } or \(anything but gloss or sezione){strings word strings}, it has to be taken.
Obviously, aword, worda and aworda must not be taken.

(The bold word has been taken; the plain word has not.)

My problem is avoiding a match on the word when it sits inside "{.... word .....}".

My attempt so far is (?<!(sezione\\{)|(gloss\\{))(\\b)( ?)word(\\b)(?!.*\\{}). I tried adding ".*" to the lookbehind and lookahead, as in (?<!(sezione\\{)|(gloss\\{).*)[...], but then it stops working.

If it matters, I plan to use Java's regex engine.

Thanks in advance.

Edit: the major problem is

\(anything here)sezione{anything word anything}

If I can avoid matching this one, that should solve the whole problem.

Let's establish a few hard facts about your use case:

  1. Java's regex engine (like most regex engines) doesn't support variable-length lookbehind.
  2. Java's regex engine doesn't support the \K pattern, which lets you reset the start of the reported match.
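To illustrate why a fixed-length lookbehind on its own is not enough, here is a minimal sketch. The class and method names are hypothetical, and the pattern is a simplified version of the one in the question. The lookbehind only inspects the characters immediately before the match, so it rejects the word directly after "gloss{" but not when other text sits between the brace and the word.

```java
import java.util.regex.Pattern;

final class FixedLookbehindDemo {
    // Fixed-length lookbehind: rejects "word" only when "sezione{" or "gloss{"
    // sits directly in front of it.
    static final Pattern FIXED =
        Pattern.compile("(?<!sezione\\{|gloss\\{)\\bword\\b");

    // Returns true if the pattern finds an acceptable "word" anywhere in the input.
    static boolean matches(String input) {
        return FIXED.matcher(input).find();
    }

    public static void main(String[] args) {
        // The word directly follows "gloss{", so the lookbehind blocks it.
        System.out.println(matches("\\gloss{word}"));
        // The word is preceded by "anything ", so the lookbehind does not see "gloss{".
        System.out.println(matches("\\gloss{anything word anything}"));
    }
}
```

The second input is exactly the case the question struggles with: the lookbehind cannot look past an arbitrary amount of text to find the opening "gloss{".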

Without those features, you need a workaround that works in 3 steps:

  1. Make sure the input matches the expected lookbehind pattern.
  2. If it does, remove the part of the string matched by the lookbehind pattern.
  3. In the resulting string, match and extract your search pattern.

Consider the following code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "(anything here)sezione{anything word anything}";
// lookbehind pattern: everything up to and including "sezione{", "gloss{" or "word{"
String lookbehind = "^.*?(?:sezione|gloss|word)\\{";
// make sure the input matches the lookbehind pattern first
if (str.matches(lookbehind + ".*$")) {
    // actual search pattern
    Pattern p = Pattern.compile("[^}]*?\\b(word)\\b");
    // search in the string with the lookbehind prefix removed
    Matcher m = p.matcher(str.replaceFirst(lookbehind, ""));
    if (m.find()) {
        System.out.println(m.group(1));
        //> word
    }
}

PS: You may need to improve the code by checking indexes in the input string for the starting point of the search pattern.
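One possible way to work with match indexes is sketched below. It is not the code above but a different technique, a "match and skip" alternation: the left branch consumes a whole \...gloss{...} or \...sezione{...} command, and only the right branch captures the word. The class name, method name and sample string are hypothetical, and the sketch assumes that the text before gloss/sezione consists of word characters and that braces are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordOutsideCommands {
    // Left branch: a complete \...gloss{...} or \...sezione{...} command (assumed non-nested).
    // Right branch: a standalone "word" with an optional preceding space, captured in group 1.
    private static final Pattern P = Pattern.compile(
        "\\\\\\w*(?:gloss|sezione)\\{[^}]*\\}|( ?\\bword\\b)");

    // Returns the start index of every accepted occurrence (including the leading space, if any).
    static List<Integer> findStarts(String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            // Group 1 is set only when the right branch matched.
            if (m.group(1) != null) {
                starts.add(m.start(1));
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        String s = "a word, \\gloss{a word b}, \\emph{word} and aword";
        System.out.println(findStarts(s));
    }
}
```

Because the excluded commands are consumed as whole matches, the word inside them is never offered to the right branch, so no lookbehind is needed. Occurrences inside other commands, such as \emph{word} in the sample, are still reported.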

