Is there a simple way to convert a CSV with 0-indexed paths as keys to JSON with Miller?
Consider the following CSV:
email/1,email/2
abc@xyz.org,bob@pass.com
You can easily convert it to JSON (taking into account the paths defined by the keys) with Miller:
mlr --icsv --ojson --jflatsep '/' cat file.csv
[ { "email": ["abc@xyz.org", "bob@pass.com"] } ]
Now, if the paths are 0-indexed in the CSV (which is surely more common):
email/0,email/1
abc@xyz.org,bob@pass.com
Then, without prior knowledge of the field names, it seems that you'll have to rewrite the whole conversion:
edit: replaced the hard-coded / with the FLATSEP builtin variable:
mlr --icsv --flatsep '/' put -q '
  begin { @labels = []; print "[" }
  # translate the original CSV header from 0-indexed to 1-indexed
  NR == 1 {
    i = 1;
    for (k in $*) {
      @labels[i] = joinv( apply( splita(k,FLATSEP), func(e) {
        return typeof(e) == "int" ? e+1 : e
      }), FLATSEP );
      i += 1;
    }
  }
  NR > 1 { print @object, "," }
  # create an object from the translated labels and the row values
  o = {};
  i = 1;
  for (k,v in $*) {
    o[@labels[i]] = v;
    i += 1;
  }
  @object = arrayify( unflatten(o,FLATSEP) );
  end { if (NR > 0) { print @object } print "]" }
' file.csv
I would like to know if I'm missing something obvious, like a command-line option or a way to rename the fields with the put verb, or maybe something else? You're also welcome to give your insights about the previous code, as I'm not really confident in my Miller programming skills.
Update:
With @aborruso's approach of pre-processing the CSV header, this could be reduced to:
note: I didn't keep the regextract part because it means knowing the CSV header in advance.
mlr --csv -N --flatsep '/' put '
  NR == 1 {
    for (i,k in $*) {
      $[i] = joinv( apply( splita(k,FLATSEP), func(e) {
        return typeof(e) == "int" ? e+1 : e
      }), FLATSEP );
    }
  }
' file.csv |
mlr --icsv --flatsep '/' --ojson cat
Even if there are workarounds like using the rename verb (when you know the header in advance) or pre-processing the CSV header, I still hope that Miller's author could add an extra command-line option that would deal with this kind of 0-indexed external data; adding DSL functions like arrayify0 (and flatten0) could also prove useful in some cases.
I would like to know if I'm missing something obvious, like a command-line option or a way to rename the fields with the put verb, or maybe something else?
Starting from this:
email/0,email/1
abc@xyz.org,bob@pass.com
you can use the implicit CSV header option (-N) and run:
mlr --csv -N put 'if (NR == 1) {for (k in $*) {$[k] = "email/".string(int(regextract($[k],"[0-9]+"))+1)}}' input.csv
to get:
email/1,email/2
abc@xyz.org,bob@pass.com
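The command above hard-codes the email/ prefix. If the header is not known in advance, the same header pre-processing can also be done outside Miller. The following is only a minimal sketch with plain awk, not a Miller feature: shift_header_indices is a hypothetical helper name, and it assumes an unquoted header without embedded commas, LF line endings, and a single-character path separator.

```shell
# Hypothetical helper (not part of Miller): add 1 to every purely numeric
# path segment in the header line of a CSV read from stdin.
# Assumptions: the header fields are unquoted and contain no commas,
# line endings are LF, and the separator is a single character.
shift_header_indices() {
  # $1: path separator used in the header (defaults to "/")
  awk -v sep="${1:-/}" '
    BEGIN { FS = ","; OFS = "," }
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        n = split($i, parts, sep)
        field = ""
        for (j = 1; j <= n; j++) {
          # only segments made of digits are treated as array indices
          if (parts[j] ~ /^[0-9]+$/) parts[j] = parts[j] + 1
          field = (j == 1) ? parts[j] : field sep parts[j]
        }
        $i = field
      }
    }
    # data rows are printed untouched
    { print }
  '
}
```

The result could then be piped into the second stage already shown in the question, for example: shift_header_indices < input.csv | mlr --icsv --flatsep '/' --ojson cat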