
bash find and replace - sed awk

I'm trying to clean up a text file with over 120,000 lines via a bash script. I need to perform several find and replaces. The order of each find and replace is important and the file needs to 'remember' the previous find and replaces.

example: replace all '.'(period) with '.\n' (period and new line), then

replace all '?' (question mark) with '?\n' (question mark and new line), then

replace all '!' (exclamation mark) with '!\n' (exclamation mark and new line), then... etc.

I'm doing this, but it's not working:

#!/usr/bin/env bash

sed 's/./.\n/g'
sed 's/?/?\n/g'
sed 's/!/!\n/g'
input.txt

What am I doing wrong?

Is sed or awk better for what I'm trying to achieve?

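Your script runs three separate sed commands, none of which is given input.txt as its input, and the unescaped . in the first pattern matches any character rather than a literal period. You can always pipe sed commands, but in this case it makes sense to combine all the conditions into one command: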

sed 's/[.!?]/&\n/g' file > newfile

The [.!?] bracket expression matches ., ! or ?, and & in the replacement pattern puts the matched value back into the string (the newline is added right after this value).

Demo:

s="This is a text. Want more? Yes! End"
sed 's/[.!?]/&\n/g' <<< "$s"

Output:

This is a text.
 Want more?
 Yes!
 End
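
For comparison, here is a minimal sketch of the piped form, where each sed reads the previous one's output so the substitutions apply in order. It assumes GNU sed (which treats \n in the replacement as a newline), and the function name split_piped is only illustrative. Note the escaped \. in the first pattern, since an unescaped . matches any character.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): reads stdin, writes stdout.
split_piped() {
  sed 's/\./.\n/g' |   # literal period, then a newline
  sed 's/?/?\n/g'  |   # ? is literal in a basic regular expression
  sed 's/!/!\n/g'
}

printf '%s\n' "This is a text. Want more? Yes! End" | split_piped
```

The result should match the output of the single-command version above; the combined form is simply shorter and starts one process instead of three.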

If you need to get rid of the spaces after ?, ! and ., use

sed 's/\([.!?]\)[[:space:]]*/\1\n/g' file > newfile

Here:

  • \([.!?]\) - Capturing group 1: matches ., ! or ?
  • [[:space:]]* - zero or more whitespace characters

The \1 in the replacement pattern refers to the value captured into Group 1.
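
Applied to the same sample string as the first demo, this variant can be checked with a short sketch like the one below. It again assumes GNU sed for \n in the replacement, and the function name split_trimmed is only illustrative.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): break after . ! ? and
# drop any whitespace that follows the punctuation mark.
split_trimmed() {
  sed 's/\([.!?]\)[[:space:]]*/\1\n/g'
}

s="This is a text. Want more? Yes! End"
printf '%s\n' "$s" | split_trimmed
```

Each sentence should now start at the beginning of its line, without the leading space seen in the first demo's output.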

