Convert key:value to CSV file

I found the following bash script for converting a file with key:value information to a CSV file:

awk -F ":" -v OFS="," '
BEGIN { print "category","recommenderSubtype", "resource", "matchesPattern", "resource", "value" }
function printline() {
print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
}
{data[$1] = $2}
NF == 0 {printline(); delete data}
END {printline()}
' file.yaml

But after executing it, it only converts the first group of data (only the first 6 rows of data), like this:

category,recommenderSubtype,resource,matchesPattern,resource,value
COST,CHANGE_MACHINE_TYPE,instance-1,f1-micro,instance-1,g1-small

My original file looks like this (with 1000 rows or more):

category:COST
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:portal-1
matchesPattern:f1-micro
resource:portal-1
value:g1-small
category:PERFORMANCE
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:old-3
matchesPattern:n1-standard-4
resource:old-3
value:n1-highmem-2

Is there a command I am missing?

The problem with the original script lies in these lines:

NF == 0 {printline(); delete data}
END {printline()}

The first line means: call printline() if the current line has no fields, that is, if it is empty. The second line means: call printline() after all data has been processed. Since the input file contains no blank lines between the groups, the first rule never fires, and only the END rule prints a single row.
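To illustrate, here is a sketch of the same rules run against input in which the groups are separated by a blank line. That separator is an assumption and not what the question's file looks like. The sample data is fed through a here-document rather than file.yaml, the header line is left out for brevity, and the array is cleared with a loop instead of `delete data` so that the sketch does not depend on whole-array delete being supported.

```shell
#!/bin/sh -e
# Sketch: the original rules, run on input that DOES contain a blank line
# between groups (an assumption; the question's file has none).
csv=$(awk -F ":" -v OFS="," '
function printline() {
    print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
}
# Store every key:value pair of the current group.
{ data[$1] = $2 }
# An empty line has NF == 0: emit the group and clear the array.
NF == 0 { printline(); for (k in data) delete data[k] }
# The last group is not followed by a blank line, so emit it here.
END { printline() }
' <<'EOF'
category:COST
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:portal-1
matchesPattern:f1-micro
resource:portal-1
value:g1-small

category:PERFORMANCE
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:old-3
matchesPattern:n1-standard-4
resource:old-3
value:n1-highmem-2
EOF
)
printf '%s\n' "$csv"
```

With the blank line present, the NF == 0 rule fires once and the END rule fires once, so one row is printed per group. Remove the blank line from the sample and only the END rule is left to print anything.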

The difficulty with the input data format is that it gives no clear indicator of when to output the next record. In the following, I have simply changed the script to output the data after every six input lines. If there can be duplicate keys, the criterion for output might instead be "all fields populated" or similar, which would need to be programmed slightly differently.

#!/bin/sh -e
awk -F ":" -v OFS="," '
BEGIN {
    records_in = 0
    print "category","recommenderSubtype", "resource", "matchesPattern", "resource", "value"
}
{
    data[$1] = $2
    records_in++
    if(records_in == 6) {
        records_in = 0;
        print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
    }
}
' file.yaml
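For comparison, the "all fields populated" criterion could be sketched roughly as follows. This is only one possible reading of that criterion: it assumes every group contains all five distinct keys, and it clears the array with a loop after each row so that the check starts fresh for the next group. The sample data is supplied through a here-document for the sake of a self-contained example; in practice the script would read file.yaml instead.

```shell
#!/bin/sh -e
# Sketch: emit a row as soon as all expected keys have been seen,
# instead of counting input lines.
csv=$(awk -F ":" -v OFS="," '
BEGIN {
    print "category", "recommenderSubtype", "resource", "matchesPattern", "resource", "value"
}
{
    data[$1] = $2
    # "key in array" tests for presence without creating the element.
    if (("category" in data) && ("recommenderSubtype" in data) &&
        ("resource" in data) && ("matchesPattern" in data) && ("value" in data)) {
        print data["category"], data["recommenderSubtype"], data["resource"], data["matchesPattern"], data["resource"], data["value"]
        # Clear the array element by element before the next group.
        for (k in data) delete data[k]
    }
}
' <<'EOF'
category:COST
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:portal-1
matchesPattern:f1-micro
resource:portal-1
value:g1-small
category:PERFORMANCE
recommenderSubtype:CHANGE_MACHINE_TYPE
resource:old-3
matchesPattern:n1-standard-4
resource:old-3
value:n1-highmem-2
EOF
)
printf '%s\n' "$csv"
```

Unlike the fixed count of six, this variant does not care how many lines a group has, but a group that lacks one of the keys would be merged into the following one.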

Other comments

  • I have just removed the delete statement, because I am unsure what it does when applied to a whole array. The POSIX specification for awk only defines it for deleting single array elements. In case the whole array should be deleted, it recommends looping over the elements. If all fields are always present, however, it may well be possible to eliminate it altogether.
  • Welcome to SO (I am new here as well). Next time you ask, I would recommend tagging the question awk rather than bash, because AWK is really the scripting language used in this question, with bash only being responsible for calling awk with suitable parameters :)
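Regarding the first comment, clearing the array starts to matter once a group can lack a field. The following is a minimal sketch under assumptions that are not in the question: each group is taken to begin with a category line, only three columns are printed, and `clear_array` is a hypothetical helper name. The second group in the sample has no value line.

```shell
#!/bin/sh -e
# Sketch: clear the array element by element so that values from the
# previous group do not leak into a group that lacks a field.
csv=$(awk -F ":" -v OFS="," '
function printline() {
    print data["category"], data["resource"], data["value"]
}
# Hypothetical helper: delete every element of the array one by one.
function clear_array(arr,    k) {
    for (k in arr) delete arr[k]
}
# Assumption: a "category" line starts a new group, so flush the previous one.
$1 == "category" && NR > 1 { printline(); clear_array(data) }
{ data[$1] = $2 }
# Flush the last group.
END { printline() }
' <<'EOF'
category:COST
resource:portal-1
value:g1-small
category:PERFORMANCE
resource:old-3
EOF
)
printf '%s\n' "$csv"
```

Because the array is cleared between groups, the second row ends with an empty value column. Without the call to `clear_array`, it would silently repeat g1-small from the first group.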
