How to use sed to replace u'sometext' with 'sometext'

I have a file of text in which I simply want to strip the leading u from every instance of u'sometext', leaving 'sometext'. I haven't been able to figure out how to get sed to match u' and replace it with '.

The sed command I thought would work:

echo ['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null'] | sed "s/u'/'/g"

output:

[a, uupdate for microsoft office 2013 (kb4022166) 32-bit edition, unknown, null]

what I wanted:

['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

More examples of what is in the file:

"[u'cpe:/o:microsoft:windows_7::sp1:x64-enterprise', u'cpe:/a:adobe:acrobat:11.0.19']"

What I would like to have:

"['cpe:/o:microsoft:windows_7::sp1:x64-enterprise', 'cpe:/a:adobe:acrobat:11.0.19']"

Try wrapping the whole argument to echo in double quotes, like this:

echo "['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']" | sed "s/u'/'/g"

OUTPUT:

['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

Without the double quotes, the shell does not pass the text through as one complete string: it splits it into several words and strips the single quotes, so sed never sees u'.
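The effect of the quoting can be seen side by side in a small sketch. It assumes a POSIX-style shell such as sh or bash; strip_u is a hypothetical helper name and data.txt is a placeholder file name, neither of which comes from the question.

```shell
# Hypothetical helper: the substitution from the question, reading stdin.
strip_u() {
    sed "s/u'/'/g"
}

# Unquoted argument: the shell removes the single quotes first,
# so sed receives [ua, ub] and finds no u' to replace.
echo [u'a', u'b'] | strip_u

# Double-quoted argument: the single quotes reach sed intact.
echo "[u'a', u'b']" | strip_u

# Reading the data from a file sidesteps shell quoting altogether:
# strip_u < data.txt
```

Since the real data lives in a file, the quoting problem only affects the echo test, not the sed command itself.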

You will need to use a word boundary, written \b (a GNU sed extension), placed immediately before the first character that must start on a boundary:

$ echo "[u'a', u'hello']" | sed "s/\bu'/'/g"
['a', 'hello']
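What the boundary buys you shows up when a string ends in a word that itself ends in u. The following is a sketch assuming GNU sed (for \b); the value 'menu' is made-up sample data and both function names are hypothetical.

```shell
# Plain pattern from the question: matches any u followed by a quote.
strip_u_naive() {
    sed "s/u'/'/g"
}

# Same pattern, but the u must begin a word (GNU sed \b).
strip_u_bounded() {
    sed "s/\bu'/'/g"
}

# The naive version also eats the last letter of menu.
echo "[u'menu', u'a']" | strip_u_naive     # ['men', 'a']

# With \b, the u inside menu is not at a word start, so it survives.
echo "[u'menu', u'a']" | strip_u_bounded   # ['menu', 'a']
```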
Alternatively, capture each complete quoted string and drop only the u directly in front of it:

$ echo "[u'a', u'hello', u'version 7-u']" | sed "s/u\('[^']*'\)/\1/g"
['a', 'hello', 'version 7-u']

$ echo "['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']" | sed "s/u\('[^']*'\)/\1/g"
['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

$ echo "[u'cpe:/o:microsoft:windows_7::sp1:x64-enterprise', u'cpe:/a:adobe:acrobat:11.0.19']" | sed "s/u\('[^']*'\)/\1/g"
['cpe:/o:microsoft:windows_7::sp1:x64-enterprise', 'cpe:/a:adobe:acrobat:11.0.19']

Note, though, that both the command above and the currently accepted answer fail if a single-quote-delimited string consists of, or ends with a word that is, a lone u, e.g.:

$ echo "['u', 'a']" | sed "s/u\('[^']*'\)/\1/g"
['', 'a']

$ echo "['u', 'a']" | sed "s/\bu'/'/g"
['', 'a']
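To see where each pattern goes wrong, the match can be made visible instead of being deleted. This is only a diagnostic sketch: it wraps the matched text in <...> using sed's & replacement, the function names are hypothetical, and the second one assumes GNU sed for \b.

```shell
# Mark what the capture-style pattern matches.
show_capture_match() {
    sed "s/u'[^']*'/<&>/g"
}

# Mark what the word-boundary pattern matches.
show_boundary_match() {
    sed "s/\bu'/<&>/g"
}

# The match starts at the u inside the first string and runs to the
# opening quote of the next one.
echo "['u', 'a']" | show_capture_match    # ['<u', '>a']

# The u is a whole word, so it sits on a boundary and is matched.
echo "['u', 'a']" | show_boundary_match   # ['<u'>, 'a']
```

In both cases sed has no notion of which quote opens a string and which one closes it, which is the gap the next approach addresses.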

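So, assuming that is an issue, a more robust approach is to use awk (here GNU awk, for its multi-character RS and RT). Each complete quoted string becomes a record separator, so $0 only ever holds text outside the quotes, and a u is removed only when it ends that text: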

$ echo "['u', 'a']" | awk -v RS="'[^']*'" -v ORS= 'RT{sub(/u$/,"")} {print $0 RT}'
['u', 'a']

$ echo "[u'a', u'hello', u'version 7-u']" | awk -v RS="'[^']*'" -v ORS= 'RT{sub(/u$/,"")} {print $0 RT}'
['a', 'hello', 'version 7-u']
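Where GNU awk is not available, the same idea can be approximated with standard awk by walking the line one quoted string at a time with match(). This is a sketch rather than a drop-in equivalent: it works line by line, strip_u_prefix is a hypothetical name, and \047 is the octal escape for a single quote, used so the awk program can sit inside shell single quotes.

```shell
# Hypothetical helper: remove a u that directly precedes an opening quote,
# leaving everything inside '...' untouched.
strip_u_prefix() {
    awk '{
        out = ""
        rest = $0
        # Consume one complete quoted string per iteration, left to right.
        while (match(rest, /\047[^\047]*\047/)) {
            before = substr(rest, 1, RSTART - 1)   # text outside the quotes
            sub(/u$/, "", before)                  # drop a trailing u, if any
            out = out before substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

echo "['u', 'a']" | strip_u_prefix
echo "[u'a', u'hello', u'version 7-u']" | strip_u_prefix
```

Like the GNU awk version, this assumes quotes are balanced on each line and that strings contain no escaped single quotes.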
