Perl regex substitution with groups

Question

I have the following JSON input:

... "somefield":"somevalue", "time":"timevalue", "anotherfield":"value" ...

Inside my ksh script I want to replace timevalue with my own value, so I wrote this regular expression using groups, which works just fine:

data=`cat somefile.json`
echo $data | perl -pe "s|(.*time\"\s*\:\s*\").*?(\".*)|\1%TIME%\2|g" | another-script.sh

... "somefield":"somevalue", "time":"%TIME%", "anotherfield":"value" ...

However, I cannot use a number as the substitution, because Perl uses numbers to refer to groups, so this one obviously doesn't work:

perl -pe "s|(.*time\"\s*\:\s*\").*?(\".*)|\120:00:00\2|g"

I can overcome this with a two-step substitution:

perl -pe "s|(.*time\"\s*\:\s*\").*?(\".*)|\1%TIME%\2|g" | perl -pe "s|%TIME%|20:00:00|"

... "somefield":"somevalue", "time":"20:00:00", "anotherfield":"value" ...

But I am sure there is a better and more elegant way of doing it.

Answer 1

Whilst you could do this with regexes, it would be so much easier with the right tool:

jq '.time="20:00:00"' somefile.json

If you particularly wish to use Perl, the core Perl distribution has included a JSON parser since 2011, so you could do something like:

perl -MJSON::PP=decode_json,encode_json -0 -E '$j = decode_json(<>); $j->{time} = "20:00:00"; say encode_json($j)' somefile.json

Answer 2

Perl doesn't use \1 for substitution. If you had enabled warnings (e.g. with perl -w), perl would have told you it's $1, which can be disambiguated from surrounding digits by adding { }:

perl -pe 's|(.*time"\s*:\s*").*?(".*)|${1}20:00:00$2|g'

(I also removed all the redundant backslashes from the regex.)

On another note, what's the point of matching .* if you're just going to replace it by itself? Couldn't it just be

perl -pe 's|(time"\s*:\s*").*?(")|${1}20:00:00$2|g'

?

I'm not a big fan of .* or .*?. If you're trying to match the inside of a quoted string, it would be better to be specific:

perl -pe 's|(time"\s*:\s*")[^"]*(")|${1}20:00:00$2|g'

We're not trying to validate the input string, so now there's really no reason to match that final " (and replace it by itself) either:

perl -pe 's|(time"\s*:\s*")[^"]*|${1}20:00:00|g'

If your perl is not ancient (5.10+), you can use \K to "keep" the leading parts of the string, i.e. not include them in the match:

perl -pe 's|time"\s*:\s*"\K[^"]*|20:00:00|g'

Now only the [^"]* part will be substituted, saving us from having to do any capturing.

2 answers

Answer 1
6 2018-05-10 07:22:27

Answer 2
4 accepted 2018-05-10 07:29:04
