
Using sed with regex to remove text from a file

I'm trying to use sed with regex to remove specific patterns from a file. Here is the command I'm using:

sed '/location .*?\/\s+{(?:[\w\W]+?)}\s*(?=(?:location|$))/d'

Here is the sample text I'm testing with:

location / {
                try_files $uri $uri/ /index.php?$args;
}
location ~*  .(jpg|jpeg|png|gif|ico|css|js)$ {
            expires 365d;
        }

        location ~*  .(pdf)$ {
            expires 30d;
        }

        location ~ \.php$
                {
                        try_files $uri =404;
                        fastcgi_split_path_info ^(.+\.php)(/.+)$;
                        include /etc/nginx/fastcgi_params;
                        fastcgi_index index.php;
                        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                        include /etc/nginx/nginx_limits.conf;
                        if (-f $request_filename)
                        {
                                fastcgi_pass unix:/usr/local/php73/sockets/rgeoipdsm1.sock;
                        }
                }
        location ~ / {
                        try_files $uri $uri/ /index.php?$args;
        }

        location ^ / {
                        try_files $uri $uri/ /index.php?$args;
        }

        location ~ / {
                        if (-f $request_filename)
                        {
                                fastcgi_pass unix:/usr/local/php73/sockets/rgeoipdsm1.sock;
                        }
                        try_files $uri $uri/ /index.php?$args;
        }

location ~ \.png {

}

location ~ / {
}

What I'm expecting to see after running sed:

location ~*  .(jpg|jpeg|png|gif|ico|css|js)$ {
        expires 365d;
    }
    
    location ~*  .(pdf)$ {
        expires 30d;
    }

        location ~ \.php$
                {
                        try_files $uri =404;
                        fastcgi_split_path_info ^(.+\.php)(/.+)$;
                        include /etc/nginx/fastcgi_params;
                        fastcgi_index index.php;
                        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                        include /etc/nginx/nginx_limits.conf;
                        if (-f $request_filename)
                        {
                                fastcgi_pass unix:/usr/local/php73/sockets/rgeoipdsm1.sock;
                        }
                }
    location ~ \.png {

}

But after running sed I get no changes at all; it seems to match nothing. Is it possible to fine-tune this command (perhaps with escape characters?) so that it removes the required patterns?

This might work for you (GNU sed):

sed -E '/location ([~^] )?\/ \{/{:a;N;/^(\s*)location.*\n\1\}\s*$/d;ba}' file

Match the required location line, then append lines until the whitespace before a closing } matches the leading whitespace of the opening line, and delete those lines.
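For context, sed reads its input one line at a time, and its POSIX regex dialect has no lazy quantifiers, non-capturing groups, or lookaheads, so a single multi-line pattern like the one in the question cannot match. The loop above works around this by accumulating lines with N. Here is a minimal sketch of the command on a small input, assuming GNU sed (needed for \s and \n in the regex); the sample text and the socket path /tmp/example.sock are made up for illustration:

```shell
# Illustrative input only: three location blocks, one with a nested if { }.
sample=$(cat <<'EOF'
location / {
    try_files $uri $uri/ /index.php?$args;
}
    location ~* .(pdf)$ {
        expires 30d;
    }
    location ~ / {
        if (-f $request_filename)
        {
            fastcgi_pass unix:/tmp/example.sock;
        }
        try_files $uri =404;
    }
EOF
)

# On a matching "location ... / {" line, keep appending lines (N) until the
# buffer ends with a "}" indented exactly like the opening line, then delete.
loop_result=$(printf '%s\n' "$sample" |
  sed -E '/location ([~^] )?\/ \{/{:a;N;/^(\s*)location.*\n\1\}\s*$/d;ba}')

printf '%s\n' "$loop_result"
```

On this input only the .(pdf) block should remain. The inner } of the nested if is indented more deeply than its location line, so it does not end the loop early. The approach relies on each closing brace having the same indentation as its opening location line.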

You can use

sed '/^[[:blank:]]*location .*\/[[:blank:]]*{[[:blank:]]*$/,/^[[:blank:]]*}[[:blank:]]*$/d' yourfile > newfile

This sed command finds blocks of lines between a line matching ^[[:blank:]]*location .*\/[[:blank:]]*{[[:blank:]]*$ and the next line matching ^[[:blank:]]*}[[:blank:]]*$ , and removes them (with d ).
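As a sketch of how this address range behaves, the following runs the command on a small made-up input (the sample text and the path /tmp/example.sock are illustrative assumptions, not taken from a real configuration). Because the range ends at the first line that contains only }, a block with a nested { } is cut short:

```shell
# Illustrative input only: three location blocks, one with a nested if { }.
sample=$(cat <<'EOF'
location / {
    try_files $uri $uri/ /index.php?$args;
}
    location ~* .(pdf)$ {
        expires 30d;
    }
    location ~ / {
        if (-f $request_filename)
        {
            fastcgi_pass unix:/tmp/example.sock;
        }
        try_files $uri =404;
    }
EOF
)

# Delete from each "location ... / {" line through the next line holding only "}".
range_result=$(printf '%s\n' "$sample" |
  sed '/^[[:blank:]]*location .*\/[[:blank:]]*{[[:blank:]]*$/,/^[[:blank:]]*}[[:blank:]]*$/d')

printf '%s\n' "$range_result"
```

The first block is removed cleanly and the .(pdf) block is kept. For the last block, the range stops at the inner } of the if, so the trailing try_files line and the outer } are left behind. The command therefore fits files whose targeted location blocks contain no nested braces.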

The ^[[:blank:]]*location .*\/[[:blank:]]*{[[:blank:]]*$ regex matches:

  • ^ - start of the line
  • [[:blank:]]* - zero or more spaces/tabs
  • location - the literal string location, followed by a space
  • .* - any zero or more chars
  • \/ - a / char
  • [[:blank:]]*{[[:blank:]]* - a { char surrounded by zero or more spaces/tabs
  • $ - end of the line.

The ^[[:blank:]]*}[[:blank:]]*$ regex matches a line that contains only a single } char surrounded by optional spaces/tabs.
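Since the closing pattern matches any line holding only a }, the range ends at the first such line, even when it belongs to a nested block. If nested braces need to be handled regardless of indentation, one alternative is to count braces. This is a swapped-in technique using awk rather than sed, and it is not part of the answer above. Here is a minimal sketch, assuming braces never appear inside quoted strings or comments in the file; the sample input is made up:

```shell
# Illustrative input only: three location blocks, one with a nested if { }.
sample=$(cat <<'EOF'
location / {
    try_files $uri $uri/ /index.php?$args;
}
    location ~* .(pdf)$ {
        expires 30d;
    }
    location ~ / {
        if (-f $request_filename)
        {
            fastcgi_pass unix:/tmp/example.sock;
        }
        try_files $uri =404;
    }
EOF
)

depth_result=$(printf '%s\n' "$sample" | awk '
  # Same start test as the sed address ([ \t] stands in for [[:blank:]]).
  !skip && /^[ \t]*location .*\/[ \t]*[{][ \t]*$/ { skip = 1; depth = 0 }
  skip {
    depth += gsub(/[{]/, "{")   # opening braces on this line
    depth -= gsub(/[}]/, "}")   # closing braces on this line
    if (depth <= 0) skip = 0    # the location block is closed
    next                        # drop every line of the block
  }
  { print }
')

printf '%s\n' "$depth_result"
```

On this input only the .(pdf) block should remain, because the skipped region ends where the brace depth returns to zero and not at the first lone }.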
