
Remove lines with fewer than 3 words in a text file

I have seen sed commands that remove lines based on the number of characters, but none that do so based on the number of words.

E.g., I have a text file such as:

word1
word1 word2
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5

How would I use sed or awk to remove the lines with fewer than 3 words, so that the output looks like this:

word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5

Here is how to do it with awk. If a line has more than 2 fields, print it:

awk 'NF>2' file
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5
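As a quick check of how this behaves on messier input, here is a minimal sketch. It assumes awk's default field splitting, where runs of spaces and tabs separate fields and leading or trailing blanks are ignored. The sample file and its contents are hypothetical, made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample: a padded two-word line, a tab-separated line,
# an empty line, and an indented four-word line.
sample=$(mktemp)
printf 'word1\n  word1   word2  \nword1\tword2\tword3\n\n  word1 word2 word3 word4\n' > "$sample"

# NF is the number of whitespace-separated fields on the current line.
# A bare condition with no action prints the line unchanged.
result=$(awk 'NF>2' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the tab-separated line and the indented four-word line are printed, with their original spacing intact, because awk does not rewrite a line unless a field is modified.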

You could do this simply in awk:

$ awk 'NF>=3' file
word1 word2 word3
word1 word2 word3 word4
word1 word2 word4 word5

It prints the lines that have three or more fields.
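The command above only prints the filtered lines and leaves the file itself untouched. If the goal is to actually remove the short lines from the file, one common approach is to write to a temporary file and move it over the original. This is a sketch, not the answerer's method; the file created here is a hypothetical stand-in for your own text file.

```shell
#!/bin/sh
# Hypothetical stand-in for the text file to be cleaned up.
file=$(mktemp)
printf '%s\n' 'word1' 'word1 word2' 'word1 word2 word3' > "$file"

# Filter into a temporary file, then replace the original only if awk succeeded.
tmp=$(mktemp)
awk 'NF>=3' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Redirecting straight back to the same file (awk 'NF>=3' file > file) would truncate it before awk reads it, which is why the temporary file is used.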

You can try this sed command:

sed -n 's/\([^ ]\+ \)\{2,\}/&/p' file_name

[^ ] - matches any single character that is not a space
\([^ ]\+ \) - matches a word (one or more non-space characters) followed by a space
\{2,\} - matches the preceding group two or more times

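Because this pattern looks for words followed by exactly one space, it can miscount on input that is not cleanly single-spaced. The sketch below assumes GNU sed (the \+ operator is a GNU extension in basic regular expressions) and uses two hypothetical edge-case lines to compare it with awk's field count.

```shell
#!/bin/sh
# Hypothetical edge cases:
#   line 1: two words followed by a trailing space
#   line 2: three words separated by double spaces
input=$(printf 'word1 word2 \nword1  word2  word3\n')

# The sed pattern accepts line 1 (two "word + space" groups) and rejects
# line 2 (no two consecutive groups with a single space between them).
sed_result=$(printf '%s\n' "$input" | sed -n 's/\([^ ]\+ \)\{2,\}/&/p')

# awk counts fields, so it keeps only the line with three words.
awk_result=$(printf '%s\n' "$input" | awk 'NF>2')

printf 'sed: [%s]\n' "$sed_result"
printf 'awk: [%s]\n' "$awk_result"
```

On single-spaced input without trailing blanks, such as the sample in the question, both commands give the same result.
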
sed -n '/[^ ]\([^ ]*  *[^ ]\)\{2\}/ p' YourFile
# or
sed -n '/[^ ]  *[^ ][^ ]*  *[^ ]/ p' YourFile

The regex is: at least 1 non-space, then at least 1 space, then at least 1 non-space, then at least 1 space, then at least 1 non-space.

This ensures that a line such as ( word1 word2 ) is not counted as having three words: the surrounding spaces are not treated as word separators, since there is no word to separate at the extremities.
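The padded case described above can be checked with a small sketch. The input lines are hypothetical, and the command is the first of the two sed variants from this answer, which uses only POSIX basic regular expression syntax.

```shell
#!/bin/sh
# Hypothetical input: one word, two words, two words padded with spaces,
# and three words separated by uneven runs of spaces.
input=$(printf 'word1\nword1 word2\n word1 word2 \nword1  word2   word3\n')

# Print only lines containing non-space, spaces, non-space, spaces, non-space.
result=$(printf '%s\n' "$input" | sed -n '/[^ ]\([^ ]*  *[^ ]\)\{2\}/ p')

printf '[%s]\n' "$result"
```

Only the last line is printed: the padded two-word line has no non-space character after its trailing space, so the pattern cannot complete.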

This might work for you (GNU sed):

sed -n 's/\<//3p' file
