
Match specific length words, anchored, without doing magic math

Let's say I wanted to find all 12-letter words in /usr/share/dict/words that start with c and end with er. Off the top of my head, a workable pattern could look something like:

grep -E '^c.{9}er$' /usr/share/dict/words

It finds:

cabinetmaker
calcographer
calligrapher
campanologer
campylometer
...

But that .{9} bothers me. It feels too magical: I had to subtract the total length of the anchor characters (c and er, 3 in all) from the 12 in the original constraint.

Is there any way to rewrite this regex so that it doesn't require this calculation up front, allowing a literal 12 to be used directly in the pattern?

You can use the -x option, which selects only matches that match the whole line exactly, and filter by length first:

grep -xE '.{12}' /usr/share/dict/words | grep '^c.*er$'
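A quick way to check the two-stage filter on a small sample. The second stage needs its own ^ and $ anchors, otherwise a 12-letter word that merely contains c...er (scaremongers here) would slip through. The function name filter_words and its three parameters are made up for this sketch, and the prefix and suffix are pasted into the regex as-is, so they should not contain regex metacharacters.

```shell
# Hypothetical helper: filter_words LENGTH PREFIX SUFFIX, reading words from stdin.
filter_words() {
  # Stage 1: -x forces a whole-line match, so the literal length appears as-is.
  # Stage 2: anchored prefix/suffix check on the survivors.
  grep -xE ".{$1}" | grep "^$2.*$3\$"
}

# scaremongers is 12 letters and contains "c...er", but is rejected by the anchors.
printf '%s\n' cabinetmaker scaremongers calligrapher computer |
  filter_words 12 c er
```

On this sample it should print only cabinetmaker and calligrapher.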


Or use the -P option, which makes grep interpret the pattern as a Perl-compatible regular expression (available in GNU grep), together with a lookahead assertion:

grep -P '^(?=.{12}$)c.*er$'


You can use awk as an alternative and avoid this calculation:

awk -v len=12 'length($1)==len && $1 ~ /^c.*er$/' file
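If the prefix and suffix should be parameters as well, one possible variation compares them as plain strings with substr instead of building a regex. This is only a sketch: the function name words_of_length and the variable names pre and suf are invented here, and the whole line ($0) is tested rather than $1.

```shell
# Hypothetical helper: words_of_length LENGTH PREFIX SUFFIX, reading words from stdin.
words_of_length() {
  awk -v len="$1" -v pre="$2" -v suf="$3" '
    # Keep lines of exactly len characters that start with pre and end with suf.
    length($0) == len &&
    substr($0, 1, length(pre)) == pre &&
    substr($0, length($0) - length(suf) + 1) == suf
  '
}

printf '%s\n' cabinetmaker cat calligrapher computer |
  words_of_length 12 c er
```

Because the comparison is literal, a prefix or suffix containing characters such as . or * needs no escaping.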

I don't know grep that well, but some more advanced NFA regex implementations provide lookaheads and lookbehinds. If you can find a way to make those available, you could write:

^(?=c).{12}(?<=er)$

Maybe as a perl one-liner like this?

perl -ne 'print if /^(?=c).{12}(?<=er)$/' /usr/share/dict/words

One approach with GNU sed:

$ sed -nr '/^.{12}$/{/^c.*er$/p}' words

With BSD sed (Mac OS) it would be:

$ sed -nE '/^.{12}$/{/^c.*er$/p;}' words
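As far as I know, current GNU sed releases also accept -E and tolerate the ; before the closing brace, so the BSD form can probably serve as a single command on both systems. A small check on sample input, with the function name twelve_c_er made up for the purpose:

```shell
# Assumes a sed that accepts -E (BSD sed, and GNU sed in current releases).
twelve_c_er() {
  # The outer address keeps 12-character lines; the inner one prints those
  # that start with c and end with er.
  sed -nE '/^.{12}$/{/^c.*er$/p;}'
}

printf '%s\n' cabinetmaker scaremongers calligrapher computer | twelve_c_er
```

If your sed rejects -E, fall back to the -r form shown for GNU sed.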
