How does this sed command parse numbers with commas?
I'm having difficulty understanding a number-parsing sed command I saw in this article:
sed -i ':a;s/\B[0-9]\{3\}\>/,&/;ta' numbers.txt
I'm a sed newbie, so this is what I've been able to figure out:
& adds to what's already there rather than substituting for it
:a; ... ;ta calls the substitution recursively on the line until the search finds no more matches
Here's what I am hoping folks can explain:
What does -i do? I can't seem to find it on the man pages, though I'm sure it's there.
I'm a little unclear on what \B is accomplishing here. Perhaps it helps with the left-right parsing priority, but I don't see how.
So lastly... which part of the command prevents it from doing this, for example: 1234566778,9 ---> 1234,566,778,9
Breaking down this command:
sed -i ':a;s/\B[0-9]\{3\}\>/,&/;ta' numbers.txt
-i # edit the input file in place, saving the changes back to it
\B # opposite of \b (word boundary): matches only where there is no word boundary, i.e. inside a word
[0-9] # match any digit
\{3\} # match exactly 3 digits
\> # end-of-word boundary
& # stands for the whole matched text in the replacement
:a # define label a
ta # branch back to label a if the last s/// substituted something; the loop ends once \B[0-9]\{3\}\> no longer matches
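To see what -i changes in practice, here is a minimal sketch, assuming GNU sed (the \B and \> operators and the one-line :a;...;ta form are GNU features). The temporary demo file and the .bak suffix are illustrative choices and are not part of the original command, which edits numbers.txt directly.

```shell
# Hypothetical demo file; the original post edits numbers.txt directly.
demo_file=$(mktemp)
printf '20130607215015\n607220701\n992171\n' > "$demo_file"

# Without -i, sed only prints the result; the file itself is left untouched.
sed ':a;s/\B[0-9]\{3\}\>/,&/;ta' "$demo_file"

# With -i.bak, sed rewrites the file in place and keeps the original
# contents in "$demo_file.bak".
sed -i.bak ':a;s/\B[0-9]\{3\}\>/,&/;ta' "$demo_file"

cat "$demo_file"        # now holds the comma-separated numbers
cat "$demo_file.bak"    # still holds the original digits
```

Running the command without -i first is a cheap way to preview the result before overwriting anything.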
Yes, indeed: this sed command starts matching and replacing at the rightmost 3 digits and keeps moving left for as long as it finds another group of 3 digits preceded by a further digit.
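The right-to-left progression can be made visible by printing the line on every pass through the loop. This is only a sketch, assuming GNU sed; trace_commas is a hypothetical helper name, and the extra p command is added for illustration and is not in the original command.

```shell
# trace_commas is a hypothetical helper used only for this illustration.
# -n suppresses sed's automatic printing; the explicit p prints the
# pattern space at the top of every pass through the :a ... ta loop.
trace_commas() {
    sed -n ':a;p;s/\B[0-9]\{3\}\>/,&/;ta'
}

printf '20130607215015\n' | trace_commas
```

For this input the trace prints the original line followed by one line per inserted comma, each comma landing three digits further left than the previous one, ending with 20,130,607,215,015.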
Update: However, since this sed command is an inefficient loop, I recommend this much simpler and faster awk instead (the ' grouping flag, written here as \047, is locale-dependent, so the output shown assumes a locale that groups thousands with commas):
awk '/^[0-9]+$/{printf "%\047.f\n", $1}' file
20,130,607,215,015
607,220,701
992,171
Where the input file is:
cat file
20130607215015
607220701
992171
The regex matches the leftmost three digits that are NOT preceded by a word boundary and ARE followed by one, i.e. the rightmost three digits of the number. After the comma is inserted, the "goto" (ta) makes it match again, but the comma has introduced a new word boundary, so the next match happens earlier in the line.
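The effect of the word boundaries can be checked on a line holding two numbers. This is a sketch assuming GNU sed; add_commas is a hypothetical wrapper around the command from the question, without -i so that it works on standard input.

```shell
# add_commas is a hypothetical wrapper around the command from the question.
add_commas() {
    sed ':a;s/\B[0-9]\{3\}\>/,&/;ta'
}

# A single pass without the loop touches only the leftmost match,
# which is the last three digits of the first number on the line.
printf '12345 6789012\n' | sed 's/\B[0-9]\{3\}\>/,&/'

# With the loop, the substitution repeats until nothing matches any more.
printf '12345 6789012\n' | add_commas

# A comma that is already present acts as a word boundary in the same way
# as one inserted by the loop.
printf '1234,567\n' | add_commas
```

The three commands print 12,345 6789012, then 12,345 6,789,012, then 1,234,567. A number of three digits or fewer is left unchanged, because \B cannot match at the start of a word.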