Using regex and sed to replace a string inside of a file

I have the following string inside a text file:

{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}

I would like to replace all the ' that are inside the ctx._source.meta='' part with \\' using sed.

In the example above I have This is a ' test string Peedr, which I would like to convert to This is a \\' test string Peedr, so the desired output would be:

{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a \\' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}

I'm using the following regex to get the ' that is inside the ctx._source.meta string (3rd capture group).

(meta=')(.*?)(')(.*?)(')

I have the regex, but I don't know how to use the sed command to replace the 3rd capture group with \\'.

Can someone give me a hand and tell me the sed command I have to use?

Thanks in advance

sed generally does not support the Perl regex extensions, so the non-greedy .*? will probably not do what you hope. If you want to use Perl regex, use Perl!

perl -pe "s/(meta='.*?)(')(.*?')/\$1\\\\\\\\\$2\$3/"

This will still not necessarily work if the input is malformed; a better approach would be to specifically exclude single quotes from the match, and then you don't need the non-greedy matching.

sed "s/\\(meta='[^']*\\)'\\([^']*'\\)/\\1\\\\\\\\'\\2/"

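In both cases, the number of backslashes required to escape the backslashes inside the shell's double quotes is staggering: the shell halves each run of backslashes, and then sed or Perl halves it again, so every literal backslash in the output costs four in the command.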
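One way to sidestep the backslash counting, sketched here on the assumption that a small script file is acceptable: put the sed script in a file and run it with sed -f, so the shell never touches the backslashes. The file names sample.txt and escape.sed and the shortened sample line are hypothetical, and the script assumes the goal is two literal backslashes before the quote.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# Shortened sample input (hypothetical); the quoted heredoc keeps it verbatim.
cat > "$workdir/sample.txt" <<'EOF'
{"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"},"timestamp":1518165383}
EOF

# The sed script, also written verbatim. Only sed interprets the backslashes here:
# \( \) are BRE groups, and \\\\ in the replacement emits two literal backslashes.
cat > "$workdir/escape.sed" <<'EOF'
s/\(meta='[^']*\)'\([^']*'\)/\1\\\\'\2/
EOF

result=$(sed -f "$workdir/escape.sed" "$workdir/sample.txt")
printf '%s\n' "$result"

rm -r "$workdir"
```

The script file holds exactly what sed sees, which makes it easier to review than the double-quoted one-liner.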

Put back-references in the replacement for every group except the one you want to replace. There is a better way to accomplish the same task:

sed -E "s/(ctx\._source\.meta=')([^']*)(')([^']*')/\1\2\\\\\\\\'\4/"
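A quick way to check how many backslashes actually reach sed is to put the double-quoted script in a variable and print it before running it. The sketch below assumes the goal is two literal backslashes in the output; the variable name script and the shortened sample line are hypothetical.

```shell
# Shortened sample line (hypothetical); the quoted heredoc keeps it verbatim.
line=$(cat <<'EOF'
{"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"}}
EOF
)

# Eight backslashes in double quotes: the shell turns each \\ into \, leaving four.
# \. and \1 \2 \4 pass through unchanged because . and digits are not special to the shell.
script="s/(ctx\._source\.meta=')([^']*)(')([^']*')/\1\2\\\\\\\\'\4/"

# Print what sed will receive: the replacement now holds four backslashes,
# which sed reads as two literal ones.
printf '%s\n' "$script"

result=$(printf '%s\n' "$line" | sed -E "$script")
printf '%s\n' "$result"
```

If the printed script shows fewer backslashes than expected in the replacement, the shell has consumed them and more are needed in the double-quoted form.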

You may use:

sed "s/ ' / \\\\\\\\' /g" sample.txt
  • The pattern tells sed to match only a single quote that has a space on each side, so the quotes in ctx._source.meta='This and in string Peedr'"} do not match and are left unchanged.

Edit:

At the poster's request, I edited my sed command to apply to extra use cases:

sed "s/\\(ctx._source.meta='.*\\)'\\(.*Peedr'\"\\)/\\1\\\\\\\\'\\2/g"
