Replace first two whitespace occurrences with a comma using sed
I have a whitespace-delimited file with a variable number of entries on each line. I want to replace the first two whitespaces with commas to create a comma-delimited file with three columns.
Here's my input:
a b 1 2 3 3 2 1
c d 44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z y 2 3 33
And here's my desired output:
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33
I'm trying to use Perl-style regular expressions in a sed command, but I can't quite get it to work. First I tried capturing a word, followed by a space, then another word, but that only works for lines 1, 2, and 5:
$ cat test | sed -r 's/(\w)\s+(\w)\s+/\1,\2,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z,y,2 3 33
I also tried capturing whitespace, a word, and then more whitespace, but that gives me the same result:
$ cat test | sed -r 's/\s+(\w)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z,y,2 3 33
I also tried doing this with the .? wildcard, but that does something funny to line 4:
$ cat test | sed -r 's/\s+(.?)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh,,77 88 99
z,y,2 3 33
Any help is much appreciated!
How about this:
sed -e 's/\s\+/,/' | sed -e 's/\s\+/,/'
It's probably possible with a single sed command, but this is certainly an easy way :)
My output:
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33
Try this:
sed -r 's/\s+(\S+)\s+/,\1,/'
I just replaced \w (one "word" character) with \S+ (one or more non-space characters) in one of your attempts.
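To make the difference concrete, here is a small sketch that runs the \w attempt and the \S+ version side by side. It assumes GNU sed (for -r, \s, \w and \S), and first_two_commas is just a hypothetical helper name for this illustration, not something from the question.

```shell
# Assumes GNU sed; first_two_commas is a hypothetical helper for this sketch.
first_two_commas() {
    # Whitespace, a whole non-space field, whitespace -> ",field,"
    sed -r 's/\s+(\S+)\s+/,\1,/'
}

# One-character second field: the original \w attempt already matches.
echo 'a b 1 2 3' | sed -r 's/\s+(\w)\s+/,\1,/'

# Two-character second field: a single \w cannot span "jh", so the line
# comes back unchanged.
echo 'ef jh 77 88 99' | sed -r 's/\s+(\w)\s+/,\1,/'

# \S+ spans the whole field, including the punctuation in a URL.
echo 'ef jh 77 88 99' | first_two_commas
echo 'line http://google.com 100 200 300' | first_two_commas
```

Note that \w+ would fix the "jh" line but not the URL line, since ":", "/" and "." are not word characters.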
You can provide multiple commands to a single instance of sed by just providing multiple -e arguments.
To do the first two, just use:
sed -e 's/\s\+/,/' -e 's/\s\+/,/'
This runs both commands on each line in sequence: the first replaces the first block of whitespace, and the second replaces the next one.
The following transcript shows this in action:
pax$ echo 'a b 1 2 3 3 2 1
c d 44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z y 2 3 33
' | sed -e 's/\s\+/,/' -e 's/\s\+/,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33
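As a side note, \s and \+ are GNU extensions to sed's basic regular expressions. If portability matters, the same two-command idea can be sketched with POSIX character classes. The function name commas_posix is made up for this example.

```shell
# Hypothetical helper for this sketch; reads stdin, writes stdout.
# [[:space:]][[:space:]]* means "one or more whitespace characters"
# without relying on the GNU-only \s and \+.
commas_posix() {
    sed -e 's/[[:space:]][[:space:]]*/,/' -e 's/[[:space:]][[:space:]]*/,/'
}

printf 'a b 1 2 3 3 2 1\nz y 2 3 33\n' | commas_posix
```

Each -e expression still replaces only the first run of whitespace it finds, so the rest of the line keeps its spaces.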
sed's s/// command supports a way to say which occurrence of a pattern to replace: add a number n to the end of the command to replace only the nth occurrence. So, to replace the first and second occurrences of whitespace, just use it this way:
$ sed 's/  */,/1;s/  */,/2' input
a,b ,1 2 3 3 2 1
c,d ,44 55 66 2355
line,http://google.com 100,200 300
ef,jh ,77 88 99
z,y 2,3 33
EDIT: reading the other proposed solutions, I noticed that the 1 and 2 flags on the s commands are not only unnecessary but plainly wrong. By default, s/// replaces just the first occurrence of the pattern. The second command runs on the line as already changed by the first, so two identical s/// commands in sequence replace the first and the second occurrence. What you need is just
$ sed 's/  */,/;s/  */,/' input
(Note that you can put two sed commands in one expression if you separate them with a semicolon. Some sed implementations do not accept the semicolon after the s/// command; in that case, use a newline to separate the commands.)
A Perl solution is:
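Here is a short sketch of the point made in the EDIT, using a single sample line. The pattern in each command is two spaces followed by *, meaning one or more spaces; the variable name line is only for illustration.

```shell
line='a b 1 2 3'

# With numeric flags, the second command counts occurrences in the line as
# already modified by the first, so it skips one separator too many.
echo "$line" | sed 's/  */,/1;s/  */,/2'

# Two plain substitutions: each one takes the first remaining run of spaces.
echo "$line" | sed 's/  */,/;s/  */,/'

# Newline-separated form, for sed implementations that reject the semicolon.
echo "$line" | sed 's/  */,/
s/  */,/'
```

The first command leaves the space between the second and third fields in place and puts the comma one separator later, which is the kind of misplaced comma shown in the output above.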
perl -pe '$_=join ",", split /\s+/, $_, 3' some.file
Not sure about sed/perl, but here's an (ugly) awk solution. It prints fields 1 and 2, each followed by a comma, then the remaining fields separated by spaces:
awk '{
    # First two fields, each followed by a comma
    printf("%s,%s,", $1, $2)
    # Remaining fields, space-separated, with no trailing space
    for (i = 3; i <= NF; i++)
        printf("%s%s", $i, (i < NF ? " " : ""))
    printf("\n")
}' myfile.txt
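If the spacing in the rest of the line should be kept exactly as it was, a possible alternative is to edit the record in place with awk's sub(), which replaces only the first match each time it is called. This is a rough sketch rather than a tested drop-in; awk_first_two is a hypothetical name, and it assumes lines do not start with whitespace.

```shell
# Hypothetical helper for this sketch; reads stdin, writes stdout.
awk_first_two() {
    # Each sub() replaces the first remaining run of spaces or tabs in $0.
    awk '{ sub(/[ \t]+/, ","); sub(/[ \t]+/, ","); print }'
}

printf 'line http://google.com 100 200 300\n' | awk_first_two
```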