
Octave - Finding words in a string using regex

In Octave, I am trying to find words that are followed only by whitespace, or by a comma or a period and then whitespace.

The following is my code:

str = 'Hello, I am kjd#(@*#@m, aa.aa.aa.aa. It was nice meeting you.';
regexp(str, "\[a-zA-Z]+\[,.]?\s+", 'match')

This should return the words Hello, I, am, It, was, nice, meeting, you. However, it only returns was. I'm having a hard time figuring this out.

I've also tried this answer: https://stackoverflow.com/a/29174222/6213337 , but it returns ans = {}(1x0) .

Any ideas? Thanks.

Octave uses the PCRE regex flavor, so the pattern you need can be short and still cover all of these cases:

str = 'Hello, I am kjd#(@*#@m, aa.aa.aa.aa. It was nice meeting you.';
% Double-quoted Octave strings process escapes, so "\\S" reaches the regex engine as \S.
match = regexp(str, "(?<!\\S)\\p{L}++(?!\\p{P}\\S)", 'match');
disp(match)

See the regex and IDEONE demos.

The regex matches:

  • (?<!\\S) - a negative lookbehind that checks that there is no non-whitespace character immediately before the current position, so a match can start only at the beginning of the string or right after whitespace; if the check passes, matching continues with...
  • \\p{L}++ - one or more letters, matched possessively (no backtracking is allowed, so the next check is performed only once, after the last letter matched), that are NOT followed by...
  • (?!\\p{P}\\S) - any punctuation character and then a non-whitespace character ((?!...) is a negative lookahead that fails the match if its subpattern matches immediately to the right of the current position in the string).
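
To see what the possessive quantifier contributes, the following sketch runs the same pattern once with `++` and once with a plain greedy `+`. The variable names `possessive` and `greedy` are illustrative only, and the sketch assumes an Octave build whose PCRE library supports Unicode properties such as \p{L}.

```octave
str = 'Hello, I am kjd#(@*#@m, aa.aa.aa.aa. It was nice meeting you.';

% Assumption: PCRE was built with Unicode property support, so \p{L} and \p{P} work.

% Possessive ++: once the letters are consumed, the engine cannot give any back,
% so "kjd" (followed by "#(") and "aa" (followed by ".a") are rejected outright.
possessive = regexp(str, "(?<!\\S)\\p{L}++(?!\\p{P}\\S)", 'match');

% Greedy +: after the lookahead fails, the engine gives back one letter and
% retries, so fragments of rejected words ("kj" and "a") are returned as well.
greedy = regexp(str, "(?<!\\S)\\p{L}+(?!\\p{P}\\S)", 'match');

disp(possessive)
disp(greedy)
```

With the greedy variant, the lookahead is re-evaluated after each backtracking step, and a shortened word such as "kj" passes because the next character is a letter rather than punctuation. That is the case the possessive form is there to prevent.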

Try this:

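str = 'Hello, I am kjd#(@*#@m, aa.aa.aa.aa. It was nice meeting you.';
% With 'match', every hit after the first includes its leading whitespace,
% because (?:^|\s+) is part of the match; 'tokens' returns only the captured word.
regexp(str, "(?:^|\\s+)([a-zA-Z]+)(?=[,.]?(?:$|\\s))", 'match')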
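
Because this pattern consumes the whitespace in front of each word, one way to get the bare words is to read the capture group instead of the whole match. The following is a minimal sketch of that idea; the names `tok` and `words` are illustrative and not part of the original answer.

```octave
str = 'Hello, I am kjd#(@*#@m, aa.aa.aa.aa. It was nice meeting you.';

% 'tokens' returns one cell per match, each holding the text captured by ([a-zA-Z]+).
tok = regexp(str, "(?:^|\\s+)([a-zA-Z]+)(?=[,.]?(?:$|\\s))", 'tokens');

% Unwrap the inner 1x1 cells into a flat cell array of words.
words = cellfun(@(t) t{1}, tok, 'UniformOutput', false);

disp(words)
```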
