Using sed (or any other tool) to remove the quotes in a JSON file

I have a JSON file:

{"doc_type":"user","requestId":"1000778","clientId":"42114"}

I want to change it to:

{"doc_type":"user","requestId":1000778,"clientId":"42114"}

i.e. convert the requestId from a string to an integer. I have tried a few approaches, but none of them seem to work:

sed -e 's/"requestId":"[0-9]"/"requestId":$1/g' test.json
sed -e 's/"requestId":"\([0-9]\)"/"requestId":444/g' test.json 

Could someone help me out, please?

Try:

sed -e 's/\("requestId":\)"\([0-9]*\)"/\1\2/g' test.json

or:

sed -e 's/"requestId":"\([0-9]*\)"/"requestId":\1/g' test.json

The main differences from your attempts are:

  • Your regular expressions were looking for [0-9] between double quotes, and that matches exactly one digit. By using [0-9]* instead, you match any number of digits (zero or more).

  • If you want to copy a sequence of characters from the match into the replacement string, you need to define a group in the regexp, opening with \( and closing with \), and then use \1 in the replacement string to insert the matched text there. If there are multiple groups, use \1 for the first group, \2 for the second group, and so on.

Also note that the final g after the last / applies the substitution to all matches in every processed line. Without that g, the substitution is only applied to the first match in each processed line. Therefore, if you expect only one such replacement per line, you can drop the g.
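To see these points side by side, here is a small sketch. It pipes a made-up sample line into sed with printf instead of reading test.json; the sample data and the variable names are assumptions for illustration only.

```shell
# Sample input (hypothetical): one line containing two requestId fields.
line='{"requestId":"12","x":1,"requestId":"345"}'

# [0-9] matches exactly one digit, so multi-digit values are left untouched.
single=$(printf '%s\n' "$line" | sed -e 's/"requestId":"\([0-9]\)"/"requestId":\1/g')

# [0-9]* with the g flag: every match on the line is rewritten.
all=$(printf '%s\n' "$line" | sed -e 's/"requestId":"\([0-9]*\)"/"requestId":\1/g')

# Same expression without g: only the first match on the line is rewritten.
first=$(printf '%s\n' "$line" | sed -e 's/"requestId":"\([0-9]*\)"/"requestId":\1/')

printf '%s\n' "$single" "$all" "$first"
```

The first result is identical to the input, the second has both values unquoted, and the third has only the first value unquoted.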

Since you said "or any other tool", I'd recommend jq! While sed is great for line-based text, JSON is not line-based, and sometimes newlines are added just to pretty-print the output and make developers' lives easier. Its rules also get even trickier when handling Unicode or double quotes in string content. jq is specifically designed to understand the JSON format and can dissect it appropriately.
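As a rough illustration of the layout problem, the sketch below runs the compact-format sed expression against a pretty-printed copy of the object. The pretty-printed sample and the whitespace-tolerant variant are assumptions added for illustration, not part of the original question.

```shell
# Hypothetical pretty-printed version of the same object
# (note the space after each colon).
pretty='{
  "doc_type": "user",
  "requestId": "1000778",
  "clientId": "42114"
}'

# The compact-format expression finds nothing to replace here.
unchanged=$(printf '%s\n' "$pretty" | sed -e 's/"requestId":"\([0-9]*\)"/"requestId":\1/g')

# Allowing optional whitespace after the colon handles this layout,
# but still not a value that sits on the following line.
tolerant=$(printf '%s\n' "$pretty" | sed -e 's/\("requestId":[[:space:]]*\)"\([0-9]*\)"/\1\2/g')

printf '%s\n' "$tolerant"
```

Each new layout needs another tweak to the pattern, which is the kind of fragility a JSON-aware tool avoids.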

For your case, this should do the job:

jq '.requestId = (.requestId | tonumber)'

Note that this throws an error, and does not output the JSON object, if requestId is missing. If that's a concern, you might need something a little more sophisticated, like this:

jq 'if has("requestId") then .requestId = (.requestId | tonumber) else . end'

Also, jq pretty-prints and colorizes its output when it is sent to a terminal. To avoid that and get a compact, one-line-per-object format, add -Mc to the command. jq also works when given multiple objects back-to-back without a newline between them in the input. Here's a full demo of this filter:

$ (echo '{"doc_type":"bare"}{}'
   echo '{"doc_type":"user","requestId":"0092","clientId":"11"}'
   echo '{"doc_type":"user","requestId":"1000778","clientId":"42114"}'
) | jq 'if has("requestId") then .requestId = (.requestId | tonumber) else . end' -Mc

Which produced this output:

{"doc_type":"bare"}
{}
{"doc_type":"user","requestId":92,"clientId":"11"}
{"doc_type":"user","requestId":1000778,"clientId":"42114"}

sed -e 's/"requestId":"\([0-9]\+\)"/"requestId":\1/g' test.json

You were close. The "new" regex terms I had to add: \1 means "whatever is contained in the first \( \) on the search side", and \+ means "1 or more of the previous thing".

Thus, we search for the string "requestId":" followed by a group of 1 or more digits, followed by ", and replace it with "requestId": followed by the group we found earlier.
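One case where "1 or more" and "0 or more" behave differently is an empty value. The sketch below uses a made-up input line with an empty requestId. Because \+ in a basic regular expression is a GNU sed extension, it spells "1 or more digits" as [0-9][0-9]*, which should behave the same way and also works in other sed implementations.

```shell
# Hypothetical input with an empty requestId.
empty='{"doc_type":"user","requestId":"","clientId":"42114"}'

# [0-9]* also matches zero digits, so the quotes are removed and the JSON breaks.
with_star=$(printf '%s\n' "$empty" | sed -e 's/"requestId":"\([0-9]*\)"/"requestId":\1/g')

# Requiring at least one digit leaves the empty string alone.
# [0-9][0-9]* is a portable spelling of [0-9]\+.
with_one_or_more=$(printf '%s\n' "$empty" | sed -e 's/"requestId":"\([0-9][0-9]*\)"/"requestId":\1/g')

printf '%s\n' "$with_star" "$with_one_or_more"
```

The first line printed has a bare "requestId":, which is no longer valid JSON; the second is the input unchanged.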

Perhaps the jq (JSON query) tool would help you out?

$ cat test                                                  
{"doc_type":"user","requestId":"1000778","clientId":"42114"}
$ cat test |jq '.doc_type' --raw-output                     
user                                                        
$           
