
Using sed to replace delimited lists inside a larger body of text

I have a large file with many variable-length lists of numbers in square brackets. There is at most one list per line, and a list is never empty, e.g.:

[1, 45, 54, 78] or [32]

I want to get rid of the square brackets and the commas, e.g.:

1 45 54 78 or 32

I can successfully match them with this regex in sed:

\[\([0-9]*\)\(, \([0-9]*\)\)*\]

but I don't know how to use group numbers to refer to the groups I want. For example, doing:

sed 's/\t\[\([0-9]*\)\(, \([0-9]*\)\)*\]/\t\1 \3/g'

only puts the first and the last number of each list into the destination file, because a repeated group keeps only the text of its last repetition.
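The behaviour can be reproduced on a single line. This is a minimal sketch using a simplified form of the substitution above, without the tab. The function name `first_and_last` is made up for this illustration.

```shell
# Hypothetical helper: the substitution from the question, minus the tab.
# \1 is the first number; \3 sits inside the repeated group \(...\)*,
# so it ends up holding only the number from the last repetition.
first_and_last() {
  sed 's/\[\([0-9]*\)\(, \([0-9]*\)\)*\]/\1 \3/'
}

printf '[1, 45, 54, 78]\n' | first_and_last
# prints: 1 78
```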

(I did solve my problem using awk, but I am wondering whether it can be done using sed.)

Is there any way to refer to a variable number of groups in sed?

How about:

sed -E 's/\[([0-9 ,]+)\]/\1/g' | sed 's/,//g'

Two separate commands: the first extracts the content inside the square brackets, the second strips commas. Note that the second command removes every comma on the line, not only those that were inside a list.
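If commas elsewhere on the line need to survive, one possible variant is a single sed invocation that loops over the commas inside the brackets only. This is a sketch, not the answerer's command. It assumes a sed that supports `-E`, at most one list per line as stated in the question, and list items separated by a comma and one space. The function name `strip_list_commas` is hypothetical.

```shell
# Hypothetical helper: remove commas only inside a [ ... ] list, then the brackets.
strip_list_commas() {
  sed -E \
    -e ':a' \
    -e 's/(\[[0-9 ]*), /\1 /' \
    -e 'ta' \
    -e 's/\[([0-9 ]+)\]/\1/'
  # :a / ta  loop while the substitution in between still succeeds
  # 1st s    turns the first remaining ", " after "[" and digits into " "
  # 2nd s    drops the brackets once no commas are left inside them
}

printf 'x, y [1, 45, 54, 78] z\n' | strip_list_commas
# prints: x, y 1 45 54 78 z
```

The comma after `x` is untouched because the looped substitution only matches text that starts at an opening bracket.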

This awk command should do it (x is an unset variable, so every bracket and comma is replaced with the empty string):

awk '{gsub(/[][,]/,x)}1' file
1 45 54 78 or 32
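Since the question asks about sed, the same bracket expression can be used there as well. This is a sketch under the same assumption as the awk command, namely that brackets and commas occur nowhere else in the file. The function name `strip_brackets_commas` is made up for the example.

```shell
# Hypothetical helper: delete every "]", "[" and "," on each input line.
# Inside a bracket expression, "]" must come first to be taken literally.
strip_brackets_commas() {
  sed 's/[][,]//g'
}

printf '[1, 45, 54, 78] or [32]\n' | strip_brackets_commas
# prints: 1 45 54 78 or 32
```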

This might work for you (GNU sed):

sed -r ':a;/\[([0-9]+(, )*)+\]/!b;s//\n&\n/;h;s/[][,]//g;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/;ba' file

This finds the list, marks it with a newline on either side, and copies the entire line to the hold space. It then deletes the brackets and commas in the pattern space, appends the saved original, keeps the stripped list together with the original text around it, and repeats until no further lists are found.
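The steps described above may be easier to follow with the same script spread over several lines and commented. This sketch assumes GNU sed (for `\n` in the regex and in the replacement). The wrapper function `strip_lists_hold` is a hypothetical name, and the script reads standard input instead of a file.

```shell
# Hypothetical wrapper around the one-liner above, one command per line.
strip_lists_hold() {
  sed -E '
    :a
    # No bracketed list left on this line: print it and start the next line.
    /\[([0-9]+(, )*)+\]/!b
    # Fence the matched list with newlines (the empty regex reuses the last one).
    s//\n&\n/
    # Save the fenced original line in the hold space.
    h
    # Strip brackets and commas from the whole pattern space.
    s/[][,]//g
    # Append the saved original after the stripped copy.
    G
    # Keep the original prefix (\2) and the stripped list (\1);
    # the original suffix is left in place after the match.
    s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/
    ba
  '
}

printf 'x, y [1, 45, 54, 78] z\n' | strip_lists_hold
# prints: x, y 1 45 54 78 z
```

Because the text before and after the list is taken from the saved copy, commas outside the list are preserved.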
