How to use sed to replace u'sometext' with 'sometext'

Question

I have a file with text in it. I simply want to strip the leading u from every instance of u'sometext' so that it leaves 'sometext'. I haven't been able to figure out how to get sed to match on u' and replace it with '.

The sed command I thought would work:

echo ['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null'] | sed "s/u'/'/g"

output:

[a, uupdate for microsoft office 2013 (kb4022166) 32-bit edition, unknown, null]

what I wanted:

['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

More examples of what is in the file:

"[u'cpe:/o:microsoft:windows_7::sp1:x64-enterprise', u'cpe:/a:adobe:acrobat:11.0.19']"

What I would like to have:

"['cpe:/o:microsoft:windows_7::sp1:x64-enterprise', 'cpe:/a:adobe:acrobat:11.0.19']"

Answer 1

Try wrapping the whole string in double quotes, like this:

echo "['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']" | sed "s/u'/'/g"

OUTPUT:

['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

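It seems the shell is not passing the complete string through: without the double quotes, it strips the single quotes and treats the text as several separate arguments, so sed never sees a u' to match.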
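To make the quoting effect visible, here is a minimal sketch that runs the same sed command on an unquoted and a double-quoted version of a shortened list. It assumes bash or a POSIX sh (other shells may treat the bare [ differently), and the variable names are only for illustration.

```shell
# Unquoted: the shell removes the single quotes and splits the text into
# separate arguments before echo runs, so sed never sees a u' sequence.
unquoted=$(echo ['a', u'update', 'null'] | sed "s/u'/'/g")

# Double-quoted: the whole list reaches sed as one literal string.
quoted=$(echo "['a', u'update', 'null']" | sed "s/u'/'/g")

printf '%s\n' "$unquoted" "$quoted"
```

The first line printed still contains uupdate and has lost its quotes; the second has the u prefix removed and the quotes intact.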

Answer 2

You will need to use a word boundary, denoted by the special sequence \b, which goes immediately before the first character to be matched on a boundary. Note that \b is a GNU sed extension and may not be available in other sed implementations.

 $ echo "[u'a', u'hello']" | sed "s/\bu'/'/g"
 ['a', 'hello']
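If \b is not available in your sed, a rough equivalent can be written with POSIX basic regular expressions only. This is a sketch, not a drop-in replacement: the function name strip_u_prefix is made up for the example, and it strips a u only at the start of a line or after a non-word character.

```shell
# Portable approximation of \bu' using only POSIX BRE features.
# First expression: u' at the very start of the line.
# Second expression: u' preceded by a character that is not a letter,
# digit, or underscore; that character is captured and put back.
strip_u_prefix() {
    sed -e "s/^u'/'/" -e "s/\([^[:alnum:]_]\)u'/\1'/g"
}

result=$(echo "[u'a', u'hello', 'menu']" | strip_u_prefix)
printf '%s\n' "$result"
```

Unlike a plain s/u'/'/g, this leaves the trailing u of 'menu' alone, because that u follows a letter.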

Answer 3

$ echo "[u'a', u'hello', u'version 7-u']" | sed "s/u\('[^']*'\)/\1/g"
['a', 'hello', 'version 7-u']

$ echo "['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']" | sed "s/u\('[^']*'\)/\1/g"
['a', 'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']

$ echo "[u'cpe:/o:microsoft:windows_7::sp1:x64-enterprise', u'cpe:/a:adobe:acrobat:11.0.19']" | sed "s/u\('[^']*'\)/\1/g"
['cpe:/o:microsoft:windows_7::sp1:x64-enterprise', 'cpe:/a:adobe:acrobat:11.0.19']

Note, though, that both the above and the currently accepted answer would fail if a u can appear at the end of a single-quote-delimited string earlier in the line, e.g.:

$ echo "['u', 'a']" | sed "s/u\('[^']*'\)/\1/g"
['', 'a']

$ echo "['u', 'a']" | sed "s/\bu'/'/g"
['', 'a']

So, assuming that is an issue, we can use a more robust approach with awk (in this case GNU awk, for its multi-character RS and RT):

$ echo "['u', 'a']" | awk -v RS="'[^']*'" -v ORS= 'RT{sub(/u$/,"")} {print $0 RT}'
['u', 'a']

$ echo "[u'a', u'hello', u'version 7-u']" | awk -v RS="'[^']*'" -v ORS= 'RT{sub(/u$/,"")} {print $0 RT}'
['a', 'hello', 'version 7-u']
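Since the question is about a file rather than an echoed string, here is a sketch of running the capture-group sed command from this answer against a file. The temporary files created with mktemp stand in for your real input and output files, and the same caveat about a lone 'u' string applies.

```shell
# Hypothetical input and output files; mktemp keeps the sketch self-contained.
infile=$(mktemp)
outfile=$(mktemp)

# Sample lines taken from the question.
cat > "$infile" <<'EOF'
['a', u'update for microsoft office 2013 (kb4022166) 32-bit edition', 'unknown', 'null']
[u'cpe:/o:microsoft:windows_7::sp1:x64-enterprise', u'cpe:/a:adobe:acrobat:11.0.19']
EOF

# Read from the file directly (no echo, so no shell quoting problem)
# and write the cleaned lines to a second file.
sed "s/u\('[^']*'\)/\1/g" "$infile" > "$outfile"

cat "$outfile"
```

Writing to a second file avoids sed -i, whose syntax differs between GNU and BSD sed.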

3 answers

solution1
2 2018-08-10 15:33:56

solution2
1 ACCEPTED 2018-08-10 15:37:31

solution3
0 2018-08-12 14:58:29