Convert key:value to CSV file
I found the following bash script for converting a file with key:value information to a CSV file:
awk -F ":" -v OFS="," '
BEGIN { print "category","recommenderSubtype", "resource", "matchesPattern", "resource", "value" }
function printline() {
print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
}
{data[$1] = $2}
NF == 0 {printline(); delete data}
END {printline()}
' file.yaml
But after executing it, it only converts the first group of data (only the first 6 rows of data), like this:
category,recommenderSubtype,resource,matchesPattern,resource,value
COST,CHANGE_MACHINE_TYPE,instance-1,f1-micro,instance-1,g1-small
My original file looks like this (with 1000 rows and more):
category:COST
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:portal-1
matchesPattern:f1-micro
resource:portal-1
value:g1-small
category:PERFORMANCE
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:old-3
matchesPattern:n1-standard-4
resource:old-3
value:n1-highmem-2
Is there a command I am missing?
The problem with the original script lies in these lines:
NF == 0 {printline(); delete data}
END {printline()}
The first line means: call printline() if the current line has no fields, that is, if it is empty. The second line means: call printline() after all data has been processed. If the input contains no empty lines, as in the sample shown, the first rule never fires and printline() runs only once, at the end.
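To illustrate what `NF == 0` actually tests, here is a minimal sketch that prints the field count awk sees for each line when `:` is the separator. The helper name `show_nf` and the inline sample data are made up for this illustration; only a truly empty line yields a count of 0.

```shell
# Hypothetical helper: print NF (number of fields) for every input line.
show_nf() {
    awk -F ":" '{ print NF }'
}

# Two key:value lines, one empty line, then one more key:value line.
printf 'category:COST\nvalue:g1-small\n\ncategory:PERFORMANCE\n' | show_nf
```

With groups separated by empty lines like this, a rule guarded by `NF == 0` would run once per separator; without such lines it does not run at all.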
The difficulty with the input data format is that it does not really give a good indicator of when to output the next record. In the following, I have simply changed the script to output the data after every six records (input lines). In case there can be duplicate keys, the criterion for output might be "all fields populated" or similar, which would need to be programmed slightly differently.
#!/bin/sh -e
awk -F ":" -v OFS="," '
BEGIN {
records_in = 0
print "category","recommenderSubtype", "resource", "matchesPattern", "resource", "value"
}
{
data[$1] = $2
records_in++
if(records_in == 6) {
records_in = 0;
print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
}
}
' file.yaml
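As a rough sketch of the "all fields populated" criterion mentioned above, the variant below prints a row as soon as every expected key has been seen, then empties the array. The function name `kv_to_csv` and the list of expected keys are assumptions based on the sample input. It also assumes that the key completing the set (here `value`) is the last line of each group, since otherwise a trailing line would leak into the next row.

```shell
# Hypothetical wrapper: convert key:value lines to CSV, one row per complete group.
kv_to_csv() {
    awk -F ":" -v OFS="," '
    BEGIN {
        # Distinct keys that make up one complete record (assumed from the sample).
        nkeys = split("category recommenderSubtype resource matchesPattern value", keys, " ")
        print "category", "recommenderSubtype", "resource", "matchesPattern", "resource", "value"
    }
    NF >= 2 {
        data[$1] = $2
        # Skip to the next line unless every expected key is now populated.
        for (i = 1; i <= nkeys; i++)
            if (!(keys[i] in data))
                next
        print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
        # Empty the array element by element for the next record.
        for (k in data)
            delete data[k]
    }
    ' "$@"
}

# Usage (file name taken from the question):
# kv_to_csv file.yaml
```

Because `resource` appears twice per group, the later value overwrites the earlier one and both `resource` columns print the same string, just as in the scripts above.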
Other comments:

I removed the delete statement, because I am unsure what it does. The POSIX specification for awk only defines it for deleting single array elements. In case the whole array should be deleted, it recommends doing a loop over the elements.

Regarding awk rather than bash: AWK is really the scripting language used in this question, with bash only being responsible for calling awk with suitable parameters :)
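For the point about `delete`, the loop over the elements can be sketched as follows. The helper name `clear_demo` and the sample values are illustrative only. Some implementations, gawk among them, also accept `delete data` to clear the whole array at once, which is presumably what the original script intended.

```shell
# Hypothetical helper: fill an array, clear it element by element, report sizes.
clear_demo() {
    awk 'BEGIN {
        data["category"] = "COST"
        data["value"] = "g1-small"
        n = 0; for (k in data) n++
        print "before:", n
        # Remove each element in turn, using only single-element delete.
        for (k in data)
            delete data[k]
        n = 0; for (k in data) n++
        print "after:", n
    }'
}
clear_demo
```

This prints `before: 2` followed by `after: 0`, showing that the array is empty after the loop.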