How to use sed or awk regex to parse this data in a Linux shell
I have this data in my file:
65 ---
66 FieldType: Text
67 FieldName: STATE
68 FieldNameAlt: STATE
69 FieldFlags: 4194304
70 FieldJustification: Left
71 FieldMaxLength: 2
72 ---
73 FieldType: Text
74 FieldName: ZIP
75 FieldNameAlt: ZIP
76 FieldFlags: 0
77 FieldJustification: Left
78 ---
79 FieldType: Signature
80 FieldName: EMPLOYEE SIGNATURE
81 FieldNameAlt: EMPLOYEE SIGNATURE
82 FieldFlags: 0
83 FieldJustification: Left
84 ---
85 FieldType: Text
86 FieldName: Name_Last
87 FieldNameAlt: LAST
88 FieldFlags: 0
89 FieldValue: Billa
90 FieldJustification: Left
91 ---
How can I turn that into an array and store the fields as key-value pairs, like
array['fieldtype']
array['fieldName']
for all the objects?
I think the only separator is "---", but I don't know how to do that.
Here's one way with GNU awk. It splits the input into records, which can then be worked on.
parse.awk
BEGIN {
    RS = " +[0-9]+ +---\n"
    FS = "\n"
}

{
    for(i=1; i<=NF; i++) {                  # for each line
        sf = split($i, a, ":")
        if(sf > 1) {                        # only accept successfully split lines
            sub("^ +[0-9]+ +", "", a[1])    # trim key
            sub("^ +", "", a[2])            # trim value
            array[a[1]] = a[2]              # save into array hash
        }
    }
}

{
    print "Record: " NR
    for(k in array) {
        print k " -> " array[k]
    }
    print ""
}
Save the above into parse.awk and run it like this:
awk -f parse.awk infile
Where infile contains the data you want to parse. Output:
Record: 1
Record: 2
FieldFlags -> 4194304
FieldNameAlt -> STATE
FieldJustification -> Left
FieldType -> Text
FieldMaxLength -> 2
FieldName -> STATE
Record: 3
FieldFlags -> 0
FieldNameAlt -> ZIP
FieldJustification -> Left
FieldType -> Text
FieldMaxLength -> 2
FieldName -> ZIP
Record: 4
FieldFlags -> 0
FieldNameAlt -> EMPLOYEE SIGNATURE
FieldJustification -> Left
FieldType -> Signature
FieldMaxLength -> 2
FieldName -> EMPLOYEE SIGNATURE
Record: 5
FieldFlags -> 0
FieldNameAlt -> LAST
FieldJustification -> Left
FieldType -> Text
FieldMaxLength -> 2
FieldValue -> Billa
FieldName -> Name_Last
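The output above lists FieldMaxLength -> 2 for records 3 to 5, although only the first block of the input has that field: array is never cleared, so entries from earlier records carry over. What follows is a minimal sketch of one way to avoid that, not the answer's original script. It reads the file line by line instead of relying on a regex RS, so it should not need GNU awk. The names parse_records and emit_record are hypothetical, and the sketch assumes each line may start with a line number as in the question. Empty records are skipped, so numbering starts at the STATE block.

```shell
#!/bin/sh
# parse_records FILE: print the fields of each record, one record at a time.
# Hypothetical helper; assumes "---" separators and optional leading line numbers.
parse_records() {
    awk '
    # Print the current record, then clear the array so nothing carries over.
    function emit_record(   k) {
        if (count == 0) return              # nothing collected: skip empty records
        print "Record: " ++rec
        for (k in array) print k " -> " array[k]
        print ""
        split("", array)                    # portable idiom for emptying an array
        count = 0
    }
    { sub(/^[ \t]*[0-9]+[ \t]+/, "") }      # drop a leading line number, if present
    /^---/ { emit_record(); next }          # separator closes the current record
    {
        pos = index($0, ":")                # split on the first colon only
        if (pos > 0) {
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            sub(/^[ \t]+/, "", val)         # trim leading blanks from the value
            array[key] = val
            count++
        }
    }
    END { emit_record() }                   # flush a record with no trailing separator
    ' "$1"
}
```

Call it as parse_records infile. The order of the fields inside a record still depends on the awk implementation, because for (k in array) does not guarantee any order.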
You can use something like this:
sed -n '/FieldType/,/FieldName/{N};s/FieldType: \([^\n]*\)\nFieldName: \([^\n]*\)/a["\2"]=\1/gp' input >> tmp.sh
and do:
source tmp.sh
or use eval instead of redirection and source; however, the space in the EMPLOYEE SIGNATURE field will cause problems.
Using Perl makes more sense though.
In any type of awk:
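#!/usr/bin/awk -f
# Adjust the path above if awk lives elsewhere on your system.
BEGIN {
    FS = ":[[:blank:]]*"    # split each line into key and value
    counter = 0
}
/:/ {
    array[counter,$1] = $2
}
/---/ {
    counter++
}
END {
    # Deal with the array.
}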
This creates an array where each cell, counted off by 'counter', contains the fields as described above, with array[x,key] = value.
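The END block above is left open. As a minimal sketch of how the stored cells could be read back, the combined index can be split on SUBSEP, the separator awk places between the subscripts of array[x,key]. The function name list_fields is hypothetical, and the sketch assumes that the line numbers shown in the question are present in the file and should be stripped before the key is taken.

```shell
#!/bin/sh
# list_fields FILE: print "record<TAB>key<TAB>value" for every stored field.
# Hypothetical helper built around the array[counter,key] layout.
list_fields() {
    awk '
    { sub(/^[ \t]*[0-9]+[ \t]+/, "") }      # assumption: drop a leading line number
    /^---/ { counter++; next }              # each separator starts a new record
    {
        pos = index($0, ":")                # split on the first colon only
        if (pos == 0) next
        key = substr($0, 1, pos - 1)
        val = substr($0, pos + 1)
        sub(/^[ \t]+/, "", val)             # trim leading blanks from the value
        array[counter, key] = val
    }
    END {
        for (k in array) {
            split(k, part, SUBSEP)          # part[1] = record number, part[2] = key
            print part[1] "\t" part[2] "\t" array[k]
        }
    }
    ' "$1" | sort -n                        # for-in order is unspecified, so sort
}
```

Call it as list_fields infile. Because the input starts with a separator, the first record is numbered 1.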