Use of awk/sed to get specific data in rows based on a pattern and arrange them in columns in Unix

I have a file as below:

ID: 1  
Name: Admin1  
Class: Administrator  
Class: Leader  
AliasName: User1  
AliasedObject: Administrator,Admin1  

ID: 2  
Name: Admin2  
Class: Administrator  
Class: Leader  
AliasName: User2  
AliasedObject: Administrator,Admin2  

ID: 3  
Name: Admin3  
Class: Administrator  
Class: Leader  
AliasName: User3  
AliasedObject: Administrator,Admin3  

Now I need to extract only the AliasName and AliasedObject values, as below:

AliasName  AliasedObject  
User1      Administrator,Admin1  
User2      Administrator,Admin2  
User3      Administrator,Admin3  

How can I do this in Unix using awk or sed?

Whenever you have data that consists of name-value pairs, it's a good idea to create a name2value array and access the fields by their names, e.g.:

$ cat tst.awk
BEGIN {
    RS=""; FS="\n"; OFS="\t"
    numNames = split("AliasName AliasedObject",names,/ /)
    for (i=1; i<=numNames; i++) {
        printf "%s%s", names[i], (i<numNames?OFS:ORS)
    }
}
{
    delete n2v
    for (i=1;i<=NF;i++) {
        name  = gensub(/:.*/,"",1,$i)          # text before the first colon
        value = gensub(/[^:]+:\s*/,"",1,$i)    # text after the first colon and any blanks
        n2v[name] = value
    }
    for (i=1; i<=numNames; i++) {
        printf "%s%s", n2v[names[i]], (i<numNames?OFS:ORS)
    }
}
$ awk -f tst.awk file
AliasName       AliasedObject
User1   Administrator,Admin1
User2   Administrator,Admin2
User3   Administrator,Admin3

That way, if you later want to print additional fields, you just change split("AliasName AliasedObject",names,/ /) to split("AliasName AliasedObject Class",names,/ /) or whatever (but having 2 different fields both named "Class" in your data would be an issue you should fix at the source if that really exists in your data, since the later value overwrites the earlier one in n2v).

The above uses GNU awk for a couple of extensions (delete array, gensub(), and \s), but is easily tweaked to work with any awk if necessary.
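
As a rough illustration of that tweak, the sketch below replaces the three extensions with portable equivalents: split("", n2v) to empty the array, index() and substr() instead of gensub(), and a [ \t] bracket expression instead of \s. It is one possible rewrite, not the answer's own code, and the file name sample.txt and the two-record input are made up for the example.

```shell
# Hypothetical sample input in the same layout as the question.
cat > sample.txt <<'EOF'
ID: 1
Name: Admin1
AliasName: User1
AliasedObject: Administrator,Admin1

ID: 2
Name: Admin2
AliasName: User2
AliasedObject: Administrator,Admin2
EOF

result=$(awk '
BEGIN {
    RS = ""; FS = "\n"; OFS = "\t"
    numNames = split("AliasName AliasedObject", names, " ")
    for (i = 1; i <= numNames; i++)
        printf "%s%s", names[i], (i < numNames ? OFS : ORS)
}
{
    split("", n2v)                           # portable way to empty an array
    for (i = 1; i <= NF; i++) {
        pos = index($i, ":")                 # first colon splits name from value
        if (pos == 0) continue               # ignore lines without a colon
        name  = substr($i, 1, pos - 1)
        value = substr($i, pos + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim blanks around the value
        n2v[name] = value
    }
    for (i = 1; i <= numNames; i++)
        printf "%s%s", n2v[names[i]], (i < numNames ? OFS : ORS)
}' sample.txt)

printf '%s\n' "$result"
```

Unlike the gensub() version, this one also trims trailing blanks from each value, which can matter if the input lines end in spaces.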

While the above is the best approach in general, for this particular case, if your input file values contain no blanks, I'd just use @fedorqui's concise solution: https://stackoverflow.com/a/29698956/1745001 .

Supposing the file is exactly like this, you can set the record separator to paragraph mode (that is, RS="", thanks Ed Morton) so that each blank-line-separated block becomes one record, and then print the fields you need by position:

awk 'BEGIN{RS=""; print "AliasName","AliasedObject"}
     {print $10, $12}' file

Test

$ awk 'BEGIN{RS=""; print "AliasName","AliasedObject"} {print $10,$12}' a
AliasName AliasedObject
User1 Administrator,Admin1
User2 Administrator,Admin2
User3 Administrator,Admin3
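
The positions $10 and $12 only hold when every record has exactly the six lines shown in the question. A hedged variant, still assuming the values contain no blanks, looks each value up by the label that precedes it. The file name sample_pos.txt and the uneven records are invented for the illustration.

```shell
# Hypothetical sample: the two records have different numbers of lines,
# so fixed field positions would not line up.
cat > sample_pos.txt <<'EOF'
ID: 1
Name: Admin1
Class: Administrator
AliasName: User1
AliasedObject: Administrator,Admin1

ID: 2
Name: Admin2
AliasName: User2
AliasedObject: Administrator,Admin2
EOF

result=$(awk '
BEGIN { RS = ""; print "AliasName", "AliasedObject" }
{
    alias = obj = ""
    # Default FS splits on blanks and newlines, so each label is followed by its value.
    for (i = 1; i < NF; i++) {
        if ($i == "AliasName:")     alias = $(i + 1)
        if ($i == "AliasedObject:") obj   = $(i + 1)
    }
    print alias, obj
}' sample_pos.txt)

printf '%s\n' "$result"
```

A record that lacks one of the labels simply prints an empty value in that column.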

sed -n '1 i\
AliasName AliasedObject
/^AliasName/ { 
   s/.*:[[:space:]]*//
   N
   s/.AliasedObject:[[:space:]]*/    /p
   }' YourFile
  • Assumes every record in the file has the same structure, with the AliasedObject line directly after the AliasName line.
  • Loads the AliasName line, strips its label, and appends the next line. It prints only if the second substitution succeeds (a small safety check).

#!/usr/bin/perl -ln

BEGIN{ $/='';  print "AliasName\tAliasedObject";}

%F = m/(?:^|\n)(\S+):\s*(.*)/g;
print "$F{AliasName}\t$F{AliasedObject}"

This way, some of the fields can be empty, absent, or written in a different order.
