
Remove whitespace between letters and collapse double spaces in Bash

I have the following text:

T h i s  i s  s o m e  t e x t .

What I need is the following:

This is some text.

The structure follows a regular pattern, so I'd assume there is a way to perform the necessary modifications using a shell command (a script of some sort would also be fine). I'm not that proficient with shell tools, so I couldn't come up with something that works.

Thanks in advance!

With sed you can do:

$ echo "$a"
T h i s  i s  s o m e  t e x t .

$ sed 's/\(.\) /\1/g' <<< "$a"
This is some text.
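
To spell out why this works: the pattern `\(.\) ` matches any character followed by one space, and the replacement keeps only the character. In a double space, the first space is consumed together with the preceding letter; the scan then resumes at the second space, which is followed by a letter rather than a space, so it survives. A minimal sketch that wraps the same command in a function reading stdin (the name `join_letters` is made up for illustration):

```shell
# join_letters is a hypothetical helper name, not part of the answer above.
# It reads text on stdin and drops the single space that follows each character.
join_letters() {
  sed 's/\(.\) /\1/g'
}

a='T h i s  i s  s o m e  t e x t .'   # sample input from the question
printf '%s\n' "$a" | join_letters       # prints: This is some text.
```

Reading from stdin instead of a here-string means the same function also works on files and pipelines, e.g. `join_letters < input.txt`.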

Perl to the rescue:

perl -pe 's/ (?! )//g' -- input.txt
  • (?! ) is a "negative look-ahead assertion", so the whole pattern matches a space that is not followed by another space

The sed and perl variants above are the straightforward ways to do it.

Here is an awk variant as an example:

$ awk -F'[ ]' '{for(i=1;i<=NF;i++){if ($i=="") $i=" "; printf "%s", $i}}' <<< 'T h i s  i s  s o m e  t e x t .'
This is some text.

$ awk -F'[ ]' '{for(i=1;i<=NF;i++){if ($i=="") $i=" "}}1' OFS= <<< 'T h i s  i s  s o m e  t e x t .'
This is some text.
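
Both commands split each line on every single space, so the second space of a pair shows up as an empty field, which is then turned back into a space. One caveat with the second command: awk only rebuilds the line with the empty OFS when a field is assigned, so a line without any double space would be printed unchanged. A sketch of a variant that forces the rebuild (the function name is hypothetical):

```shell
# join_letters_awk is a hypothetical helper name.
# Fields are split on every single space; OFS is empty so fields are glued together.
join_letters_awk() {
  awk -F'[ ]' -v OFS= '{
    for (i = 1; i <= NF; i++)
      if ($i == "") $i = " "   # an empty field marks the second space of a pair
    $1 = $1                    # force the line to be rebuilt with the empty OFS
    print
  }'
}

printf '%s\n' 'T h i s  i s  s o m e  t e x t .' | join_letters_awk   # prints: This is some text.
printf '%s\n' 'a b c' | join_letters_awk                               # prints: abc
```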

You can use this GNU sed command, which is based on word boundaries:

s='T h i s  i s  s o m e  t e x t    .'

sed -E 's/\b( +\B| )//g' <<< "$s"

Output:

This is some text.
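
The difference from the simpler sed command shows up on this input, where a run of spaces precedes the period. After a word character, `\b( +\B| )` removes either a run of spaces that ends next to another non-word character (the remaining space or the period) or a single space before the next letter. A side-by-side sketch, assuming GNU sed since `\b` and `\B` are GNU extensions; the function names are invented for the comparison:

```shell
# Both function names are hypothetical; they wrap the two sed commands from this page.
join_simple()   { sed 's/\(.\) /\1/g'; }
join_boundary() { sed -E 's/\b( +\B| )//g'; }   # requires GNU sed

s='T h i s  i s  s o m e  t e x t    .'
printf '%s\n' "$s" | join_simple     # prints: This is some text  .
printf '%s\n' "$s" | join_boundary   # prints: This is some text.
```

The simple version leaves two spaces in front of the period, while the word-boundary version removes the whole run.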
