macOS sed - Complex substitution command

Question

I have a text file with a lot of lines and need to do some complex substitutions using macOS sed. It's a bit hard to explain my problem, so I'll show you an example first:

The file:

#00101:A9AA%AAB
#03901:%E+2100009+X3800
#06008:01020304

Expected output:

#00101:0000%A00
#03901:%E+2000000+X0000
#06008:01020304

For all lines starting with "#xxx01:" (where x represents any digit), I need to replace all alphanumeric characters (A-Z, 0-9) with "0", except the digits before the ":" and any two-character sequences starting with "%" or "+".

I am aware of the basic substitution and exception commands, as well as using "^" to match a pattern at the start of a line, but I am confused about how to combine all those commands. How should I go about doing this? Non-sed solutions are welcome if this is impossible in sed.

Answer 1

Create a file script.sed containing:

/^#[0-9]{3}01:/ {
    :r
    s/:((0|[+%]..)*)[A-Za-z1-9]/:\10/
    t r
}

Save your sample input in a file named data. Run the command shown to produce the output below:

$ sed -E -f script.sed data
#00101:0000%AA0
#03901:%E+0000000+X3000
#06008:01020304
$

The option -E tells sed to use extended regular expressions. The option -f tells it to read the program from the file script.sed.

The pattern /^#[0-9]{3}01:/ looks for lines starting with a #, followed by 3 digits, 01, and a colon. The lines between { and } are executed for each matching line.
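To see which lines that address selects, here is a small sketch that uses the sample input from the question and prints only the matching lines. The scratch directory and the variable name selected are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)

# Sample input copied from the question.
cat > "$tmpdir/data" <<'EOF'
#00101:A9AA%AAB
#03901:%E+2100009+X3800
#06008:01020304
EOF

# -n suppresses automatic printing; p prints only the lines the address selects.
selected=$(sed -E -n '/^#[0-9]{3}01:/p' "$tmpdir/data")
printf '%s\n' "$selected"

rm -rf "$tmpdir"
```

Only the first two lines are printed; #06008: has 08 rather than 01 before the colon, so the block between { and } never runs for it.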

The line :r creates a label r that can be branched to with the b or t commands. The t r command branches to label r if there has been a successful s/// command since the current input line was read or since the last t command.
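The label-and-test loop can be tried on its own with a toy input. This is only an illustration: the input aaa and the variable name looped are made up, and the same result could be had with s/a/b/g. Each command is passed in its own -e option so the label ends cleanly.

```shell
# Each pass of s/a/b/ changes only the first remaining "a"; t r jumps back to
# the label r after every successful substitution and falls through once
# nothing is left to replace.
looped=$(printf 'aaa\n' | sed -e ':r' -e 's/a/b/' -e 't r')
printf '%s\n' "$looped"
```

This prints bbb: three successful substitutions, then a failed one that ends the loop.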

The s/:((0|[+%]..)*)[A-Za-z1-9]/:\10/ command searches for the colon followed by any sequence of 0s or +.. or %.. groups (where each dot matches any character), followed by an alphanumeric character other than 0. It replaces that with the colon, the remembered match (\1), and a 0 in place of that alphanumeric character. The character class must exclude 0: if it matched 0 as well, the substitution would keep succeeding by replacing a 0 with a 0, and the loop would never end.
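Note that [+%].. protects three characters at each marker: the % or + itself plus the two characters after it. If only the marker and one following character should be kept, which is what the expected output in the question shows, one possible adjustment is to use a single dot. The sketch below assumes that reading of the requirement; the scratch directory and the variable name adjusted are illustrative.

```shell
tmpdir=$(mktemp -d)

# Sample input copied from the question.
cat > "$tmpdir/data" <<'EOF'
#00101:A9AA%AAB
#03901:%E+2100009+X3800
#06008:01020304
EOF

# Same loop as script.sed, but [+%]. skips the marker plus ONE character.
adjusted=$(sed -E \
    -e '/^#[0-9]{3}01:/ {' \
    -e ':r' \
    -e 's/:((0|[+%].)*)[A-Za-z1-9]/:\10/' \
    -e 't r' \
    -e '}' "$tmpdir/data")
printf '%s\n' "$adjusted"

rm -rf "$tmpdir"
```

With this variant the three lines come out as #00101:0000%A00, #03901:%E+2000000+X0000 and #06008:01020304.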

You can also use a command-line script instead of a script file, possibly with several -e options (one per line of the script file) or with a single script argument and enough semicolons.
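As a sketch of the several -e options form, the five lines of script.sed can be passed one per -e. Keeping the label and the t command in their own -e options avoids relying on how a particular sed treats a semicolon after a label. The scratch directory and the variable name out are illustrative.

```shell
tmpdir=$(mktemp -d)

# Sample input copied from the question.
cat > "$tmpdir/data" <<'EOF'
#00101:A9AA%AAB
#03901:%E+2100009+X3800
#06008:01020304
EOF

# One -e per line of script.sed; sed joins them with newlines.
out=$(sed -E \
    -e '/^#[0-9]{3}01:/ {' \
    -e ':r' \
    -e 's/:((0|[+%]..)*)[A-Za-z1-9]/:\10/' \
    -e 't r' \
    -e '}' "$tmpdir/data")
printf '%s\n' "$out"

rm -rf "$tmpdir"
```

The output is the same as when the program is read from script.sed with -f.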

Solution 1
3 2021-04-11 06:05:26