
Single-quote part of a line using sed or awk

Convert the input text as follows, using sed or awk:

Input file:

       113259740 QA Test in progress
       219919630 UAT Test in progress

Expected output:

       113259740 'QA Test in progress'
       219919630 'UAT Test in progress'

Using GNU sed or BSD (OSX) sed:

sed -E "s/^( *)([^ ]+)( +)(.*)$/\1\2\3'\4'/" file
  • ^( *) captures all leading spaces, if any
  • ([^ ]+) captures the 1st field (a run of non-space characters of at least length 1)
  • ( +) captures the space(s) after the first field
  • (.*)$ matches the rest of the line, whatever it may be
  • \1\2\3'\4' replaces each (matching) input line with the captured leading spaces, followed by the 1st field, followed by the captured first inter-field space(s), followed by the single-quoted remainder of the input line. To discard the leading spaces, simply omit \1.
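
As a quick check of the explanation above, here is a minimal sketch that runs the same substitution on the sample input, once keeping the leading spaces and once omitting \1 to discard them. The variable names (input, kept, trimmed) are only for illustration.

```shell
# Sample input from the question (leading spaces included).
input='       113259740 QA Test in progress
       219919630 UAT Test in progress'

# Keep the leading spaces: \1 is part of the replacement.
kept=$(printf '%s\n' "$input" | sed -E "s/^( *)([^ ]+)( +)(.*)$/\1\2\3'\4'/")

# Discard the leading spaces: \1 is omitted from the replacement.
trimmed=$(printf '%s\n' "$input" | sed -E "s/^( *)([^ ]+)( +)(.*)$/\2\3'\4'/")

printf '%s\n' "$kept" "$trimmed"
```

The first two output lines keep their indentation; the last two start directly with the number.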

Note:

  • Matching the 1st field is more permissive than strictly required in that it matches any non-space sequence of characters, not just digits (as in the sample input data).
  • A generalized solution supporting other forms of whitespace (such as tabs), including after the 1st field, would look like this:

     sed -E "s/^([[:space:]]*)([^[:space:]]+)([[:space:]]+)(.*)$/\1\2\3'\4'/" file
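
To see the generalized form at work, the sketch below feeds it a made-up line that uses tabs instead of spaces (the tab-separated input is an assumption, not part of the question's data). The tabs should come through unchanged, with only the remainder quoted.

```shell
# Hypothetical input: a leading tab, and a tab after the first field.
line=$(printf '\t113259740\tQA Test in progress')

quoted=$(printf '%s\n' "$line" |
  sed -E "s/^([[:space:]]*)([^[:space:]]+)([[:space:]]+)(.*)$/\1\2\3'\4'/")

printf '%s\n' "$quoted"
```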

If your sed version doesn't support -E (or -r) to enable extended regexes, try the following POSIX-compliant variant, which uses a basic regex:

 sed "s/^\( *\)\([^ ]\{1,\}\)\( \{1,\}\)\(.*\)$/\1\2\3'\4'/" file

And in awk:

awk '{ printf "%s '"'"'", $1; for (i=2; i<NF; ++i) printf "%s ", $i; print $NF "'"'"'" }' file

Explanation:

  • printf "%s '"'"'", $1; prints the first field, followed by a space and a quote ( ' ).
  • for (i=2; i<NF; ++i) printf "%s ", $i; prints all of the following fields except the last one, each followed by a space.
  • print $NF "'"'"'" prints the last field followed by a quote ( ' ).

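Note that '"'"' is used to embed a single quote ( ' ) in the awk program, which is single-quoted as a whole: the first ' closes the shell's single-quoted string, "'" supplies the literal quote, and the last ' reopens the single-quoted string. An alternative is to specify the quote character on the command line as a variable: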

awk -v qt="'" '{ printf "%s %s", $1, qt; for (i=2; i<NF; ++i) printf "%s ", $i; print $NF qt }' file
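
Another option, not used in the answer above, is the octal escape \047, which awk expands to a single quote inside a string literal. The sketch below wraps that variant in a shell function; the function name quote_rest is made up for illustration.

```shell
quote_rest() {
  # \047 is the octal escape for a single quote inside an awk string,
  # so no shell quoting tricks and no -v variable are needed.
  awk '{ printf "%s \047", $1; for (i = 2; i < NF; ++i) printf "%s ", $i; print $NF "\047" }' "$@"
}

printf '       113259740 QA Test in progress\n' | quote_rest
```

As with the other field-based awk commands, leading whitespace is dropped and runs of spaces between fields collapse to one.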

You could also try this GNU sed command:

sed -r "s/^( +) ([0-9]+) (.*)$/\1 \2 '\3'/g" file
  • ^( +) matches one or more spaces at the start of the line and stores them in group 1.

  • ([0-9]+), preceded by a literal space, matches the run of digits that follows and stores it in group 2.

  • (.*)$, preceded by another literal space, matches everything after the digits up to the end of the line and stores it in group 3.

  • The replacement puts the captured groups back together, with group 3 wrapped in single quotes. Because of the literal space after group 1, a line needs at least two leading spaces to match; other lines are left unchanged.

Example:

$ cat ccc
       113259740 QA Test in progress
       219919630 UAT Test in progress

$ sed -r "s/^( +) ([0-9]+) (.*)$/\1 \2 '\3'/g" ccc
       113259740 'QA Test in progress'
       219919630 'UAT Test in progress'

You can do this by taking advantage of the word splitting that read performs in shells such as bash. To avoid ending up with extra single quotes in the final result, first strip any that are already in the input with sed. This also trims any extra spaces before i, between i and j, and after j.

cat file.txt | sed "s/'//g" | while read -r i j; do echo "$i '$j'"; done

Here, read assigns the first word to the variable i and the rest of the line to j.
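
The splitting behavior can be checked in isolation with a small sketch like the one below. It leaves out the cat and sed steps and uses printf instead of echo; the function name quote_with_read and the irregularly spaced input are made up for illustration.

```shell
quote_with_read() {
  # read splits on IFS whitespace: the first word goes to i and the
  # remainder of the line, minus surrounding whitespace, goes to j.
  while read -r i j; do
    printf "%s '%s'\n" "$i" "$j"
  done
}

printf '       113259740 QA   Test in progress  \n' | quote_with_read
```

Spaces inside the remainder (here the run after QA) are kept as they are; only the whitespace around i and j is trimmed.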

An awk solution:

awk -v q="'" '{ f1=$1; $1=""; print f1, q substr($0,2) q }' file
  • awk splits each input line into fields by whitespace (the default behavior).
  • -v q="'" defines the awk variable q containing a single quote, which makes it easier to use a single quote inside the awk program, which is single-quoted as a whole.
  • f1=$1 saves the 1st field for later use.
  • $1="" effectively removes the first field from the input line, leaving $0, which originally referred to the whole input line, to contain a space followed by the rest of the line (strictly speaking, the fields are re-concatenated using the output field separator OFS, which defaults to a space; since the 1st field is now empty, the resulting $0 starts with a single space followed by all remaining fields separated by a space each).
  • print f1, q substr($0,2) q then prints the saved 1st field, followed by a space ( OFS ) due to the comma, followed by the remainder of the line (with the initial space stripped with substr() ) enclosed in single quotes ( q ).

Note that this solution normalizes whitespace:

  • leading and trailing whitespace is removed
  • interior whitespace of length greater than 1 is compressed to a single space each.
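
The normalization is easy to see with a made-up line containing irregular spacing (the input below is an assumption for illustration, not from the question):

```shell
# Hypothetical input with leading, trailing, and repeated interior spaces.
line='   113259740    QA   Test  in progress   '

normalized=$(printf '%s\n' "$line" |
  awk -v q="'" '{ f1=$1; $1=""; print f1, q substr($0,2) q }')

printf '%s\n' "$normalized"
```

Assigning to $1 makes awk rebuild $0 from its fields, which is why every run of spaces ends up as a single space.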

Since the post is tagged with bash, here is an all-Bash solution that preserves leading whitespace.

while IFS= read -r line; do
    read -r f1 f2 <<<"$line"
    echo "${line/$f1 $f2/$f1 $'\''$f2$'\''}"
done < file

Output:

       113259740 'QA Test in progress'   
       219919630 'UAT Test in progress'

Here is a simple way to do it with awk:

awk '{sub($2,v"&");sub($NF,"&"v)}1' v=\' file

Output:

       113259740 'QA Test in progress'
       219919630 'UAT Test in progress'

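It does not change the formatting of the file. Note that sub() treats $2 and $NF as regular expressions and replaces only the first match, so this relies on those words not matching earlier in the line and not containing regex metacharacters.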
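
Because sub() replaces the first match of its pattern, a line such as "7 progress in progress" would likely come out as "7 'progress' in progress" with the command above. A possible alternative that also keeps the formatting is sketched below. It is not from the answer; it assumes fields are separated by spaces or tabs, and the function name quote_rest_keep_layout is made up. It uses match() to find where the first field and its separator end, then quotes everything after that point.

```shell
quote_rest_keep_layout() {
  # match() locates the leading whitespace, the first field, and the
  # separator after it; RLENGTH is the length of that prefix.
  awk -v q="'" '
    match($0, /^[ \t]*[^ \t]+[ \t]+/) {
      print substr($0, 1, RLENGTH) q substr($0, RLENGTH + 1) q
      next
    }
    { print }   # lines without a first field followed by whitespace pass through
  ' "$@"
}

printf '       7 progress in progress\n' | quote_rest_keep_layout
```

Any trailing whitespace on a line ends up inside the quotes, since the remainder is taken verbatim.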


 