Export JSON to CSV with jq
I have the following JSON file:
{"data":[
{"id":"DDM003","base":[{"date":"2020-06-04T00:30:00Z","value":335.2,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":330.1,"state":"A","validated":2}]},
{"id":"DTR001","base":[{"date":"2020-06-04T00:30:00Z","value":0.2,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":0.1,"state":"A","validated":2}]},
{"id":"FFM003","base":[{"date":"2020-06-04T00:30:00Z","value":2.62,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":3.15,"state":"A","validated":2}]},
{"id":"RAIN12","base":[{"date":"2020-06-04T00:30:00Z","value":15.0,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":16.0,"state":"A","validated":2}]},
{"id":"RHM003","base":[{"date":"2020-06-04T00:30:00Z","value":85.41,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":85.35,"state":"A","validated":2}]},
{"id":"WVM003","base":[{"date":"2020-06-04T00:30:00Z","value":2.56,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":3.08,"state":"A","validated":2}]},
{"id":"TLR001","base":[{"date":"2020-06-04T00:30:00Z","value":14.28,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":14.36,"state":"A","validated":2}]},
{"id":"THR001","base":[{"date":"2020-06-04T00:30:00Z","value":14.07,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":14.23,"state":"A","validated":2}]},
{"id":"PPR001","base":[{"date":"2020-06-04T00:30:00Z","value":999.2,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":998.9,"state":"A","validated":2}]},
{"id":"RHR001","base":[{"date":"2020-06-04T00:30:00Z","value":80.5,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":80.0,"state":"A","validated":2}]},
{"id":"WDR001","base":[{"date":"2020-06-04T00:30:00Z","value":317.71,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":320.31,"state":"A","validated":2}]},
{"id":"WVR001","base":[{"date":"2020-06-04T00:30:00Z","value":2.75,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":2.33,"state":"A","validated":2}]},
{"id":"WSR001","base":[{"date":"2020-06-04T00:30:00Z","value":2.91,"state":"A","validated":2},{"date":"2020-06-04T01:00:00Z","value":2.44,"state":"A","validated":2}]}
]}
I would like to export it (and any file of this type with a larger date range) into the following CSV format using jq (note the reordering of the fields):
date;WDR001;WVR001;WSR001;TLR001;THR001;DTR001;PPR001;RHR001;DDM003;WVM003;FFM003;RHM003;RAIN12
2020-06-04 00:30:00;317.71;2.75;2.91;14.28;14.07;0.2;999.2;80.5;335.2;2.56;2.62;85.41;15
2020-06-04 01:00:00;320.31;2.33;2.44;14.36;14.23;0.1;998.9;80;330.1;3.08;3.15;85.35;16
with the condition that any value whose "state" attribute (which always exists) is not one of {A,R,O,W,K}, or whose "validated" attribute (which can be missing) is different from 2, is set to the default value -9999.
Could somebody please help me with the jq filter needed to achieve this?
Many thanks.
Using the -r command-line option, the following filter produces CSV as shown below:
def adjust:
if (.state | test("^[AROWK]$") | not) or .validated != 2
then .value = -9999
else .
end;
["WDR001", "WVR001", "WSR001", "TLR001", "THR001", "DTR001", "PPR001", "RHR001", "DDM003", "WVM003", "FFM003", "RHM003", "RAIN12"] as $keys
| [.data[] | .id as $id | .base[] | . + {id: $id}]
| group_by(.date)[]
| map(adjust)
| .[0].date as $date
| ( INDEX(.[]; .id ) | map_values({value}) )
| [ $date, ($keys[] as $k | .[$k].value)]
| @csv
In words: store the .id in the .base objects; group these objects by .date; make the adjustment of .value; create a mapping from .id to .value; and finally emit the desired output.
Output:
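To see the adjustment step in isolation, here is a small sketch that applies it to three hand-made records (the sample values are made up for illustration and are not taken from the file above). Note that `.validated != 2` is also true when "validated" is missing, so such a record gets -9999. The `adjust_lenient` definition is a hypothetical variant, assuming a missing "validated" should be accepted instead.

```shell
# Hand-made sample records (hypothetical values, not from the question's file).
records='[{"value":1.5,"state":"A","validated":2},
          {"value":2.5,"state":"X","validated":2},
          {"value":3.5,"state":"A"}]'

# adjust: the definition from the answer; a missing "validated" counts as invalid.
strict=$(printf '%s' "$records" | jq -c '
  def adjust:
    if (.state | test("^[AROWK]$") | not) or .validated != 2
    then .value = -9999 else . end;
  map(adjust | .value)')

# adjust_lenient (hypothetical variant): a missing "validated" is treated as 2.
lenient=$(printf '%s' "$records" | jq -c '
  def adjust_lenient:
    if (.state | test("^[AROWK]$") | not) or (.validated // 2) != 2
    then .value = -9999 else . end;
  map(adjust_lenient | .value)')

echo "strict:  $strict"
echo "lenient: $lenient"
```

With the strict definition only the first record keeps its value; with the lenient one the third record (no "validated" key) keeps its value as well. Which behaviour is wanted depends on how "can be missing" in the question is meant to be read.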
"2020-06-04T00:30:00Z",317.71,2.75,2.91,14.28,14.07,0.2,999.2,80.5,335.2,2.56,2.62,85.41,15
"2020-06-04T01:00:00Z",320.31,2.33,2.44,14.36,14.23,0.1,998.9,80,330.1,3.08,3.15,85.35,16
It's easy enough to add the headers.
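One possible way to add the header row, sketched on a cut-down input with two stations (the sample input and the shortened $keys list are illustrative, and the filter is a slightly shortened form of the one above): emit `["date"] + $keys` as an extra array before the data rows, so that @csv formats it like any other row.

```shell
# Cut-down sample input with two stations (illustrative subset of the data).
input='{"data":[
  {"id":"DDM003","base":[{"date":"2020-06-04T00:30:00Z","value":335.2,"state":"A","validated":2}]},
  {"id":"WDR001","base":[{"date":"2020-06-04T00:30:00Z","value":317.71,"state":"A","validated":2}]}]}'

csv=$(printf '%s' "$input" | jq -r '
  def adjust:
    if (.state | test("^[AROWK]$") | not) or .validated != 2
    then .value = -9999 else . end;
  ["WDR001", "DDM003"] as $keys
  | ( ["date"] + $keys ),                      # header row, emitted once
    ( [.data[] | .id as $id | .base[] | . + {id: $id}]
      | group_by(.date)[]
      | map(adjust)
      | .[0].date as $date
      | INDEX(.[]; .id)
      | [ $date, ($keys[] as $k | .[$k].value) ] )
  | @csv')

printf '%s\n' "$csv"
```

Because the comma binds tighter than the pipe, both the header array and every data row pass through the final @csv.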
If you want semi-colon separated values without the quotation marks around the date, you could use join(";") instead of @csv.
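A minimal sketch of the semicolon variant, applied to a single already-built row (the row values are illustrative). join(";") converts the numbers to strings itself. The two sub calls are an added assumption, used here only to turn the ISO timestamp into the `2020-06-04 00:30:00` form shown in the question.

```shell
# One row as the filter would build it, before formatting (illustrative values).
row='["2020-06-04T00:30:00Z",317.71,2.75,-9999]'

line=$(printf '%s' "$row" | jq -r '
  .[0] |= (sub("T"; " ") | sub("Z$"; ""))   # rewrite only the date field
  | join(";")')

printf '%s\n' "$line"
```

Unlike @csv, join does no quoting or escaping, so this is only safe as long as no field can itself contain a semicolon.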
A more compact variant of the same filter, with adjust defined as above:
["WDR001", "WVR001", "WSR001", "TLR001", "THR001", "DTR001", "PPR001", "RHR001", "DDM003", "WVM003", "FFM003", "RHM003", "RAIN12"] as $keys
| [.data[] | .base[] + {id}]
| group_by(.date)[]
| map(adjust)
| .[0].date as $date
| ( INDEX(.[]; .id ) | map_values({value}) )
| [ $date, (.[$keys[]]|.value)]
| @csv