sed replace text matching a complicated regex pattern

Question

I am in the process of porting an existing database schema to PostgreSQL.

I need to replace occurrences of the word 'go' with a semicolon.

I have noticed that the word 'go' appears in the text in the following pattern:

[non-empty string (SQL)]
[followed by one or more newlines]
[followed by one or more whitespace characters]
[followed by the word 'go']
[followed by one or more newlines]

I want to replace the above pattern with the following one:

[non-empty string (SQL)]
[followed by ';']
[followed by TWO newlines]

I am trying to build a regular expression that I can use with sed to perform the replacement described above, but I am relatively new to regex.

For the purpose of clarity, I have included sample text BEFORE and AFTER the substitution I want to achieve:

-- Original File contents below -------



go
CREATE TABLE foobar
(
    f1    INT,
    f2    INT,
    f3    FLOAT,
        f4    VARCHAR(32) NOT NULL,
    f5    INT,
    f6    datetime,
        f7    smallint
)


go

GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr
go
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2)

go


--- REPLACED FILE CONTENTS -----------



go
CREATE TABLE foobar
(
    f1    INT,
    f2    INT,
    f3    FLOAT,
        f4    VARCHAR(32) NOT NULL,
    f5    INT,
    f6    datetime,
        f7    smallint
);

GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr;
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2);

Can anyone help with the expression to use to achieve this, so I can execute: sed -i 's/original_match_expr/replacement_expr/g' myfile.sql

Answer 1

Try the following solution, which uses the GNU version of sed:

sed -ne ':a; $! { N; ba }; s/\([^[:space:]]\)[[:space:]]*go/\1;/g; p' infile

It reads the whole file into a buffer and replaces every go that follows a non-whitespace character, together with the whitespace in between, with a semicolon. It yields:

go
CREATE TABLE foobar
(
    f1    INT,
    f2    INT,
    f3    FLOAT,
        f4    VARCHAR(32) NOT NULL,
    f5    INT,
    f6    datetime,
        f7    smallint
);

GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr;
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2);
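
One caveat worth checking against your own schema: the pattern is not anchored to a whole line, so a go inside a longer word (category, for example) is rewritten too. The sketch below compares the one-liner with a stricter variant that only accepts go on a line of its own. It assumes GNU sed, and the input text and the variable names input, loose and strict are made up for illustration.

```shell
# Hypothetical input in which "go" also occurs inside the word "category".
input='SELECT category FROM t

go
SELECT 2
go'

# The one-liner from this answer also rewrites the "go" inside "category".
loose=$(printf '%s\n' "$input" |
  sed -ne ':a; $! { N; ba }; s/\([^[:space:]]\)[[:space:]]*go/\1;/g; p')

# Stricter variant (-E selects extended regex syntax in GNU sed):
# "go" must sit on its own line, with only blanks around it.
strict=$(printf '%s\n' "$input" |
  sed -E -ne ':a; $! { N; ba }; s/([^[:space:]])[[:space:]]*\n[[:blank:]]*go[[:blank:]]*(\n|$)/\1;\2/g; p')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The stricter pattern keeps the newline that follows go (the second group), so the next statement still starts on its own line.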

EDIT to add an explanation (see comments):

It's not as hard as it seems.

:a; $! { N; ba } is a loop that reads every line of input into a buffer.
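
To see what the loop leaves in the buffer, you can swap the final p for GNU sed's l command, which prints the pattern space with embedded newlines shown as \n. A small sketch, assuming GNU sed; the three-line input and the variable name slurped are made up for illustration.

```shell
# Hypothetical three-line input: a statement, a blank line, then "go".
# After the loop, all lines sit in the pattern space joined by newlines;
# "l" prints them unambiguously and marks the end of the buffer with "$".
slurped=$(printf 'SELECT 1\n\ngo\n' | sed -n ':a; $! { N; ba }; l')
printf '%s\n' "$slurped"
```

Because the whole file is one buffer at that point, a single s command can match across line boundaries.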

[[:space:]] matches any whitespace character and [^[:space:]] negates it. So the substitution command matches from the last non-whitespace character through the word go, then puts that character back followed by a semicolon. If there is only whitespace before go, as in the first case, the substitution doesn't match and nothing is replaced.
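
The same script can be written one command per line, which makes the pieces easier to follow. This is only a sketch of the one-liner above, assuming GNU sed; the sample input and the variable name result are made up for illustration.

```shell
# Hypothetical sample input: two statements, each followed by "go".
result=$(printf 'SELECT 1\n\ngo\nSELECT 2\ngo\n' | sed -n '
  # Label marking the top of the loop.
  :a
  # Unless this is the last line, append the next line and loop again.
  $!{
    N
    ba
  }
  # The pattern space now holds the whole file. Replace the whitespace
  # run and the "go" that follow a non-whitespace character with ";".
  s/\([^[:space:]]\)[[:space:]]*go/\1;/g
  # Print the buffer once (needed because -n suppresses auto-print).
  p
')

printf '%s\n' "$result"
```

The \1 in the replacement restores the captured non-whitespace character, so only the whitespace and go are actually removed.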

Answer 2

With gawk:

awk -v RS='\\s*go' '{print $0""(RT ~ /go/? ";\n\n": "")}' file.txt

The record separator RS is set to zero or more whitespace characters followed by go. GNU awk then treats the block of text between two successive record separators as a record, and stores the text that actually matched RS in RT. So print each record followed by a custom record separator (; followed by two newlines) whenever RT contains go.

Answer 1
1 2013-11-02 20:18:42

Answer 2
1 accepted 2013-11-02 20:21:46
