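How can I clean up empty fields when converting CSV to JSON with Miller?
I have several CSV files of item data for a game I'm playing, and I need to convert them to JSON for consumption. The data can be quite irregular, with several empty fields per record, which makes the JSON output rather ugly.
Example with dummy values: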
Id,Name,Value,Type,Properties/1,Properties/2,Properties/3,Properties/4
01:Foo:13,Foo,13,ACME,CanExplode,IsRocket,,
02:Bar:42,Bar,42,,IsRocket,,,
03:Baz:37,Baz,37,BlackMesa,CanExplode,IsAlive,IsHungry,
Converted output:
[
{
"Id": "01:Foo:13",
"Name": "Foo",
"Value": 13,
"Type": "ACME",
"Properties": ["CanExplode", "IsRocket", ""]
},
{
"Id": "02:Bar:42",
"Name": "Bar",
"Value": 42,
"Type": "",
"Properties": ["IsRocket", "", ""]
},
{
"Id": "03:Baz:37",
"Name": "Baz",
"Value": 37,
"Type": "BlackMesa",
"Properties": ["CanExplode", "IsAlive", "IsHungry"]
}
]
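So far I have been quite successful using Miller. I managed to remove the completely empty columns from the CSV and to aggregate the Properties/X columns into a single array.
But now I would like to do two more things to improve the output format and make the JSON easier to consume:
Remove the empty strings "" from the Properties array
Replace the other empty strings "" (for example, the Type of the second record) with null
Desired output: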
[
{
"Id": "01:Foo:13",
"Name": "Foo",
"Value": 13,
"Type": "ACME",
"Properties": ["CanExplode", "IsRocket"]
},
{
"Id": "02:Bar:42",
"Name": "Bar",
"Value": 42,
"Type": null,
"Properties": ["IsRocket"]
},
{
"Id": "03:Baz:37",
"Name": "Baz",
"Value": 37,
"Type": "BlackMesa",
"Properties": ["CanExplode", "IsAlive", "IsHungry"]
}
]
Is there a way to achieve this with Miller?
My current commands are:
mlr -I --csv remove-empty-columns file.csv
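to clean up the columns, and
mlr --icsv --ojson --jflatsep '/' --jlistwrap cat file.csv > file.json
for the conversion.
This may not be the approach you were hoping for, because I also use jq.
Running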
mlr --c2j --jflatsep '/' --jlistwrap remove-empty-columns then cat input.csv | \
jq '.[].Properties|=map(select(length > 0))' | \
jq '.[].Type|=(if . == "" then null else . end)'
gives you
[
{
"Id": "01:Foo:13",
"Name": "Foo",
"Value": 13,
"Type": "ACME",
"Properties": [
"CanExplode",
"IsRocket"
]
},
{
"Id": "02:Bar:42",
"Name": "Bar",
"Value": 42,
"Type": null,
"Properties": [
"IsRocket"
]
},
{
"Id": "03:Baz:37",
"Name": "Baz",
"Value": 37,
"Type": "BlackMesa",
"Properties": [
"CanExplode",
"IsAlive",
"IsHungry"
]
}
]
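For readers who want to see what the two jq filters do to each record, here is a rough Python sketch of the same post-processing step, applied to JSON that has already been converted. The function name `clean_records` and the inline sample are assumptions made for illustration, and the sketch assumes every record has a Properties key; it is not part of the original pipeline.

```python
import json


def clean_records(records):
    """Roughly mirror the two jq filters on a list of already-parsed records."""
    cleaned = []
    for rec in records:
        rec = dict(rec)  # shallow copy so the caller's records are not modified
        # Counterpart of: .[].Properties |= map(select(length > 0))
        rec["Properties"] = [p for p in rec["Properties"] if len(p) > 0]
        # Counterpart of: .[].Type |= (if . == "" then null else . end)
        if rec.get("Type") == "":
            rec["Type"] = None  # json.dumps writes None as null
        cleaned.append(rec)
    return cleaned


# Hypothetical sample: the second record from the question, before cleanup.
sample = [
    {"Id": "02:Bar:42", "Name": "Bar", "Value": 42, "Type": "",
     "Properties": ["IsRocket", "", ""]},
]
print(json.dumps(clean_records(sample), indent=2))
```

Both rules are applied in a single pass here, which is also possible in jq by combining the two filters into one program instead of piping between two jq processes.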
With Miller alone, you can "filter out" the empty fields of each record with:
mlr --c2j --jflatsep '/' --jlistwrap put '
$* = select($*, func(k,v) {return v != ""})
' file.csv
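Note: strictly speaking, this builds a new record that contains only the non-empty fields, rather than deleting the empty fields from the existing record; the end result is equivalent: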
[
{
"Id": "01:Foo:13",
"Name": "Foo",
"Value": 13,
"Type": "ACME",
"Properties": ["CanExplode", "IsRocket"]
},
{
"Id": "02:Bar:42",
"Name": "Bar",
"Value": 42,
"Properties": ["IsRocket"]
},
{
"Id": "03:Baz:37",
"Name": "Baz",
"Value": 37,
"Type": "BlackMesa",
"Properties": ["CanExplode", "IsAlive", "IsHungry"]
}
]
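The "build a new record from the non-empty fields" idea can be sketched in Python with a dict comprehension over one flat CSV row. This is only an analogy to the Miller expression, not Miller's implementation; the helper name `select_non_empty` and the inline `CSV_TEXT` are assumptions for illustration, and unlike Miller, Python's csv module leaves every value as a string.

```python
import csv
import io

# Hypothetical inline input: the header plus the second record from the question.
CSV_TEXT = """Id,Name,Value,Type,Properties/1,Properties/2,Properties/3,Properties/4
02:Bar:42,Bar,42,,IsRocket,,,
"""


def select_non_empty(record):
    """Build a new dict holding only the fields whose value is not the empty string."""
    return {key: value for key, value in record.items() if value != ""}


row = next(csv.DictReader(io.StringIO(CSV_TEXT)))
kept = select_non_empty(row)
print(kept)
```

The Type key is absent from `kept` rather than set to a null value, which matches the output above: this approach drops the empty Type field of the second record instead of replacing it with null.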