
Substitute json value with space using sed and regex

I have multiple JSON files, which look like the sample below:

#sample json    
{"urlCurrent":"https://www.website1.com/inside/377/388/408/8002.html?utm_source=source&utm_medium=Click&utm_campaign=123","id":"00001"}
{"urlCurrent":"https://127.0.0.1/inside/414/756/765/34984.html","id":"00002"}
{"urlCurrent":"https://msdn.anything.com/en-us","id":"00002"}
{"urlCurrent":"https://web.something.com/","id":"00002"}

I would like the JSON to become:

#result json    
{"urlCurrent":"https://www.website1.com/","id":"00001"}
{"urlCurrent":"https://127.0.0.1/","id":"00002"}
{"urlCurrent":"https://msdn.anything.com/","id":"00002"}
{"urlCurrent":"https://web.something.com/","id":"00002"}

I think that with

sed -i 's/{regular expression}/\ /g' sample.json

which substitutes everything after the / that follows the host with a space, the result can be achieved. However, I don't know how to write a regular expression that matches the pattern I need, nor do I know which keywords I should search for to achieve this.

Is there a way to truncate urlCurrent so that it becomes the result I need? Thanks in advance!


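12/23 Update: This works:

# Match the whole URL, then write back only the scheme (\1) and the host (\2) followed by /
sed -E -i 's!(http|ftp|https)://([0-9a-zA-Z\.]+)([0-9a-zA-Z\/\.?#=_&%~+-]+)!\1://\2/!g' sample.json
# Keep everything up to the first / after the host (\1), drop the rest of the URL up to the closing quote, keep the remainder of the line (\2)
sed -i -r 's/(.*:\/\/?[^\/]+\/?)[^\"]*(.*)/\1\2/' sample.json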
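Since -i rewrites sample.json in place, it may be worth trying a pattern on a few lines first. Below is a minimal sketch of such a dry run, not the pattern from the update: it assumes every URL sits inside double quotes and uses negated character classes to stop at the first / and at the closing quote. The function name truncate_urls is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: trims each quoted URL to scheme://host/ and prints
# the result to stdout. Without -i, no file is modified.
# Assumption: URLs are delimited by double quotes, as in the sample JSON.
truncate_urls() {
  # \1 = scheme, \2 = host (everything up to the first / or ")
  # [^"]* consumes the path and query string up to the closing quote
  sed -E 's!(https?|ftp)://([^/"]+)[^"]*!\1://\2/!g' "$@"
}

# Dry run on two lines taken from the sample
printf '%s\n' \
  '{"urlCurrent":"https://127.0.0.1/inside/414/756/765/34984.html","id":"00002"}' \
  '{"urlCurrent":"https://web.something.com/","id":"00002"}' |
  truncate_urls
```

If the printed lines look right, the same expression could be applied to the file by calling sed with -i directly, or the helper can be run as truncate_urls sample.json to preview the whole file.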
