
Linux Bash Script Regex malfunction

I would like to write a bash script that decides whether each given string satisfies a set of conditions.

The conditions are:

  • The first 3 characters of the string must be "le-"
  • Between hyphens there can be any number of consonants in any arrangement and just one "e"; no other vowel is allowed
  • There must be something between hyphens
  • The string must not end with a hyphen

I made this script:

#!/bin/bash
# Testing regex

while read -r line; do
        if [[ $line =~ ^le((-[^aeiou\W]*e+[^aeiou\W]*)+)$ ]]
        then
           printf '"%s"\t\t\t-> True\n' "$line"
        else
           printf '"%s"\t\t\t-> False\n' "$line"
        fi
done < <(cat "$@")

It does everything fine, except for one thing: it says true no matter how many hyphens are next to each other. For example, it says true for the string "le--le".

I tried this regex on websites (like this one) and it worked there without this malfunction. All I can think of is that there must be some difference between the web page and Linux bash. (All I can see on the web page is that it runs PHP.)

Do you have any idea how I could make it work?

Thank you for your answers!

There's at least one problem with your regex: [^aeiou\W]. A negated "non-word" means "word", and it matches any letter, consonants included. Character classes are inclusive, not exclusive. We're better off just listing all the consonants (and for your case, we'll add 'e' and '-' to the set as well).
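A quick way to see what the class does under bash itself is to test single characters against it. The snippet below is only a sketch: accepts and class_re are hypothetical names, and the result depends on the system's regex library. On a typical Linux system I would expect the hyphen to be accepted, which would be consistent with "le--le" passing in the question.

```shell
#!/bin/bash
# Sketch: probe which single characters the asker's bracket expression accepts
# under bash's =~ operator. The pattern is kept in a variable so that it
# reaches the regex engine unchanged. Names here are hypothetical.
class_re='^[^aeiou\W]$'

accepts() {
    [[ $1 =~ $class_re ]]
}

for ch in l e - ; do
    if accepts "$ch"; then
        printf '"%s" accepted\n' "$ch"
    else
        printf '"%s" rejected\n' "$ch"
    fi
done
```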

So try this one (edit: using @Laurel's more concise char class):

`(?=^le-)(?!.*--)(?!.*-[^-]*e[^-]*e[^-]*-)[b-hj-np-tv-z-]*[^-]$`
  • (?=^le-) starts with 'le-'
  • (?!.*--) no double dashes allowed
  • (?!.*-[^-]*e[^-]*e[^-]*-) do NOT see two e's between dashes
  • [b-hj-np-tv-z-]* consume consonants, e, and dashes (same as [bcdfghjklmnpqrstvwxyze-])
  • [^-]$ last character must be non-dash
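Bash's =~ operator matches POSIX extended regular expressions, which have no lookaheads, so the pattern above needs a PCRE engine as it stands. As a rough sketch, the same conditions could be split into separate bash tests. The helper name check_le and the variable names are made up for this example and are not part of the original answer.

```shell
#!/bin/bash
# Sketch: the conditions from the list above as separate POSIX ERE tests.
# check_le is a hypothetical helper name.
check_le() {
    local s=$1
    local start='^le-'                    # must start with "le-"
    local double_dash='--'                # two dashes in a row
    local two_e='-[^-]*e[^-]*e[^-]*-'     # two e's between a pair of dashes
    local body='^[b-hj-np-tv-z-]*[^-]$'   # allowed characters, non-dash at the end

    [[ $s =~ $start ]]       || return 1
    [[ $s =~ $double_dash ]] && return 1
    [[ $s =~ $two_e ]]       && return 1
    [[ $s =~ $body ]]
}

for s in le-e le-e-le le--le le-; do
    if check_le "$s"; then
        printf '"%s"\t-> True\n' "$s"
    else
        printf '"%s"\t-> False\n' "$s"
    fi
done
```

Each negative lookahead becomes a plain match whose success rejects the string, so the logic stays the same while the syntax stays within what =~ supports.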

sweaver2112 rightly points out that the \W is causing you problems, but does not provide a working example of a bash test regex that does what you ask (at least, I couldn't get it to work).

This seems to do it (adapting Laurel's consonant regex):

[[ "$line" =~ ^le(-[b-df-hj-np-tv-z]*e[b-df-hj-np-tv-z]*)+$ ]]

It matches (e.g.):

le-e
le-e-le
le-e-e-e-e-e

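and more generally, anything of this shape (here [[:consonant:]] is only shorthand for the consonant class above, not an actual POSIX character class):

le(-[[:consonant:]]*e[[:consonant:]]*)+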

and doesn't match (e.g.):

le-
le--le
le-lea-le

Also, you can write it more cleanly this way:

c='[b-df-hj-np-tv-z]'
[[ "$line" =~ ^le(-$c*e$c*)+$ ]]
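To try this against the sample strings listed earlier, the test can be wrapped in a small function. This is a minimal sketch: is_valid and re are hypothetical names, and the whole pattern is kept in a variable so that it is passed to =~ as a regex rather than as a quoted literal.

```shell
#!/bin/bash
# Sketch: the consonant-class regex wrapped in a function.
# is_valid and re are hypothetical names chosen for this example.
c='[b-df-hj-np-tv-z]'
re="^le(-$c*e$c*)+\$"

is_valid() {
    [[ $1 =~ $re ]]
}

for s in le-e le-e-le le-e-e-e-e-e le- le--le le-lea-le; do
    if is_valid "$s"; then
        printf '"%s"\t-> True\n' "$s"
    else
        printf '"%s"\t-> False\n' "$s"
    fi
done
```

Because the consonant class leaves out both "e" and "-", every hyphen-separated segment has to contain exactly one "e", which is what rules out "le--le" and "le-".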
