I need to get the number of four-letter words using the grep command in the Linux shell. My idea was to create a list of four-letter words and then use a pipe with | wc -l.
I'm pretty new to Linux, but I have tried the following:
cat your_file | grep -c '^[ \t]*[a-zA-Z]\{5\}[ \t]*$'
and
grep -o -w "\w\{5\}" your_file
Use this Perl one-liner:
perl -lne 'print for /\b([A-Za-z]{4})\b/g' in_file
Example:
echo 'ABCD abcd abcd1 abcd_ Abcd,Abcd.' | perl -lne 'print for /\b([A-Za-z]{4})\b/g'
Output:
ABCD
abcd
Abcd
Abcd
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
[A-Za-z]{4} : Any 4-letter word: a letter, uppercase or lowercase, repeated exactly 4 times.
([A-Za-z]{4}) : The above, with parentheses used to capture the 4-letter word.
\b([A-Za-z]{4})\b : The above, flanked by a word boundary \b on both sides, which makes it a separate word.
print for /(...)/g : Iterate over the captured matches and print each one.
The regex uses this modifier:
/g : Match globally, so that every match on the line is returned rather than only the first.
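The question asked specifically about grep. A comparable pipeline can be sketched with grep itself, assuming a grep that supports -o (print only the matched text) and -w (match whole words), as GNU grep does; -o is not part of POSIX. Unlike the attempts in the question, it uses \{4\} rather than \{5\}, and a letters-only bracket expression instead of \w, which would also accept digits and underscores. This is an alternative to the Perl one-liner, not part of it:

```shell
# -o puts each match on its own line; wc -l then counts the matches.
count=$(echo 'ABCD abcd abcd1 abcd_ Abcd,Abcd.' \
    | grep -o -w '[A-Za-z]\{4\}' \
    | wc -l)
echo "$count"
```

For a file, the same idea becomes grep -o -w '[A-Za-z]\{4\}' your_file | wc -l. Note that grep -c counts matching lines, not individual matches, so it undercounts when a line contains several four-letter words.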
SEE ALSO:
perldoc perlrun : how to execute the Perl interpreter: command line switches
perldoc perlre : Perl regular expressions (regexes)
perldoc perlre : Perl regular expressions (regexes): Quantifiers; Character Classes and other Special Escapes; Assertions; Capture groups
perldoc perlrequick : Perl regular expressions quick start