Use of awk/sed to get specific data in rows based on a pattern and arrange them in columns in Unix
I have a file as below:
ID: 1
Name: Admin1
Class: Administrator
Class: Leader
AliasName: User1
AliasedObject: Administrator,Admin1

ID: 2
Name: Admin2
Class: Administrator
Class: Leader
AliasName: User2
AliasedObject: Administrator,Admin2

ID: 3
Name: Admin3
Class: Administrator
Class: Leader
AliasName: User3
AliasedObject: Administrator,Admin3
Now I have to filter only the AliasName and AliasedObject values, as below:
AliasName AliasedObject
User1 Administrator,Admin1
User2 Administrator,Admin2
User3 Administrator,Admin3
How can I do this in Unix using awk or sed?
Whenever you have data that includes name=value pairs, it's a good idea to create a name2value array and access the fields by their names, e.g.:
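$ cat tst.awk
BEGIN {
    # Paragraph mode: blank-line-separated blocks are records, lines are fields.
    RS=""; FS="\n"; OFS="\t"
    numNames = split("AliasName AliasedObject",names,/ /)
    # Print the header row.
    for (i=1; i<=numNames; i++) {
        printf "%s%s", names[i], (i<numNames?OFS:ORS)
    }
}
{
    # Map each "name: value" line of this record to n2v[name] = value.
    delete n2v
    for (i=1;i<=NF;i++) {
        name = gensub(/:.*/,"",1,$i)
        value = gensub(/[^:]+:\s*/,"",1,$i)
        n2v[name] = value
    }
    # Print the requested values in the order listed in names[].
    for (i=1; i<=numNames; i++) {
        printf "%s%s", n2v[names[i]], (i<numNames?OFS:ORS)
    }
}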
$ awk -f tst.awk file
AliasName AliasedObject
User1 Administrator,Admin1
User2 Administrator,Admin2
User3 Administrator,Admin3
That way, if you later want to print additional fields, you just change split("AliasName AliasedObject",names,/ /) to split("AliasName AliasedObject Class",names,/ /) or whatever (but having 2 different fields both named "Class" in your data would be an issue that you should fix at source if it really exists in your data).
The above uses GNU awk for a couple of extensions (delete array, gensub(), and \s), but is easily tweaked to work with any awk if necessary.
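As one illustration of such a tweak, here is a minimal sketch of the same name2value idea that avoids delete array, gensub(), and \s. The helper name print_fields, the temporary sample file, and the blank lines separating the blocks are assumptions made for this example; they are not part of the original answer.

```shell
#!/bin/sh
# Sample input (assumed layout: one blank line between blocks, as RS="" requires).
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
ID: 1
Name: Admin1
Class: Administrator
Class: Leader
AliasName: User1
AliasedObject: Administrator,Admin1

ID: 2
Name: Admin2
Class: Administrator
Class: Leader
AliasName: User2
AliasedObject: Administrator,Admin2
EOF

# print_fields is a hypothetical helper name used only in this sketch.
print_fields() {
    awk '
    BEGIN {
        RS = ""; FS = "\n"; OFS = "\t"
        numNames = split("AliasName AliasedObject", names, " ")
        for (i = 1; i <= numNames; i++)
            printf "%s%s", names[i], (i < numNames ? OFS : ORS)
    }
    {
        split("", n2v)                  # portable way to empty the array
        for (i = 1; i <= NF; i++) {
            pos = index($i, ":")        # first colon separates name from value
            if (pos == 0) continue      # skip lines that are not "name: value"
            name  = substr($i, 1, pos - 1)
            value = substr($i, pos + 1)
            sub(/^[ \t]+/, "", value)   # strip leading blanks without \s
            n2v[name] = value
        }
        for (i = 1; i <= numNames; i++)
            printf "%s%s", n2v[names[i]], (i < numNames ? OFS : ORS)
    }' "$1"
}

print_fields "$sample_file"
```

Because the values are looked up by name, the lines inside a block can appear in any order, and a name that is missing from a block simply yields an empty column.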
While the above is the best approach in general, for this particular case, if your input file values contain no blanks, I'd just use @fedorqui's concise solution: https://stackoverflow.com/a/29698956/1745001 .
Supposing the file is exactly like this, you can set the record separator to paragraph mode (that is, RS="", thanks Ed Morton) and then pick the values out of each block by field position:
awk 'BEGIN{RS=""; print "AliasName","AliasedObject"}
{print $10, $12}' file
$ awk 'BEGIN{RS=""; print "AliasName","AliasedObject"} {print $10,$12}' a
AliasName AliasedObject
User1 Administrator,Admin1
User2 Administrator,Admin2
User3 Administrator,Admin3
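The positions $10 and $12 follow from how awk splits each paragraph-mode record on whitespace, newlines included. A quick way to check them is to list the fields of the first record together with their numbers. The sketch below does that; the helper name list_first_record_fields and the temporary sample file are assumptions made for this example.

```shell
#!/bin/sh
# Sample input (assumed layout: one blank line between blocks).
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
ID: 1
Name: Admin1
Class: Administrator
Class: Leader
AliasName: User1
AliasedObject: Administrator,Admin1

ID: 2
Name: Admin2
Class: Administrator
Class: Leader
AliasName: User2
AliasedObject: Administrator,Admin2
EOF

# list_first_record_fields is a hypothetical helper name used only in this sketch.
list_first_record_fields() {
    # With RS="" and the default FS, every whitespace-separated token is a field.
    awk 'BEGIN { RS = "" } NR == 1 { for (i = 1; i <= NF; i++) print i, $i }' "$1"
}

list_first_record_fields "$sample_file"
```

For the first block this lists twelve fields, with User1 at position 10 and Administrator,Admin1 at position 12. A value containing a blank would shift the numbering, which is why the positional approach assumes values without blanks.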
sed -n '1 i\
AliasName AliasedObject
/^AliasName/ {
    s/.*:[[:space:]]*//
    N
    s/.AliasedObject:[[:space:]]*/ /p
}' YourFile
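#!/usr/bin/perl -ln
# Paragraph mode: each blank-line-separated block is read as one record.
BEGIN { $/ = ''; print "AliasName\tAliasedObject"; }
# Build a name => value hash from every "name: value" line in the block.
# [ \t]* (rather than \s*) keeps an empty value from swallowing the next line.
%F = m/(?:^|\n)(\S+):[ \t]*(.*)/g;
print "$F{AliasName}\t$F{AliasedObject}";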
This way, some of the fields can be empty, absent or written in a different order.