
awk grouping and splitting multiple delimiters

I'm trying to split a line on multiple delimiters and group the output into individual elements that I can reorder. I'm on a BSD system running the pkg_info command. The output looks like this:

yaesu-0.13nb1       Control interface for Yaesu FT-890 HF transceiver   
skk-jisyo-cdb-201212 Dictionary collection for SKK  
dbskkd-cdb-2.00nb1  SKK dictionary server based on cdb
libchewing-0.2.7    The intelligent phonetic input method library  
skk-jisyo-201212    Dictionary collection for SKK  
autoconf-2.69nb2    Generates automatic source code configuration scripts  
pkg-config-0.28     System for managing library compile/link flags 
python27-2.7.5      Interpreted, interactive, object-oriented programming language

The package name always consists of letters and numbers (and may itself contain hyphens), the version is the last hyphen-separated entry attached to the name, and the description is always separated from the name by at least one whitespace character. In the most complex example, "skk-jisyo-cdb" is the package name, "201212" is the version, and "Dictionary collection for SKK" is the description.

I need to separate the version from the package name, leaving the package name intact (including any "-" inside it) and making the version an element of its own. Lastly, the description needs to remain intact as a third element.

I think either awk or sed is capable of doing this, but I have not yet been able to group the elements correctly. Any help is much appreciated!

Here are a few of the things I have tried so far:

pkg_info -a | awk -F'[[:space:]]*' '{print $1}' | awk -F- '{$NF=" "$NF;sub(/ /,"-")}1'

Output:

yaesu- 0.13nb1
skk-jisyo cdb  201212
dbskkd-cdb  2.00nb1
libchewing- 0.2.7
skk-jisyo  201212
autoconf- 2.69nb2
pkg-config  0.28
python27- 2.7.5

And

pkg_info -a | awk 'BEGIN{FS="-| ";OFS="\t"}{print $1$2}'

Output:

yaesu0.13nb1
skkjisyo
dbskkdcdb
libchewing0.2.7
skkjisyo
autoconf2.69nb2
pkgconfig
python272.7.5

I have been able to separate out the package name and the version using two commands, but this is not what I need; these are just for reference. This gets me the version by itself:

pkg_info -a | awk -F'[[:space:]]*' '{print $1}' | awk -F- '{print $NF }'

This gets me the package name by itself:

pkg_info -a | awk -F'[[:space:]]*' '{print $1}' | sed 's/\(.*\)\(-.*\)/\1/g'

What I need my final output to be is $pkgname\t$version\t$description\n, that is, the three elements separated by tabs. For the most complex example the output would be: skk-jisyo-cdb\t201212\tDictionary collection for SKK\n

You didn't provide enough detail to be sure, but this MAY be what you want:

$ sed -r 's/([^[:blank:]]+)-([^[:blank:]]+)[[:blank:]]+/\1\t\2\t/' file
yaesu   0.13nb1 Control interface for Yaesu FT-890 HF transceiver   
skk-jisyo-cdb   201212  Dictionary collection for SKK  
dbskkd-cdb      2.00nb1 SKK dictionary server based on cdb
libchewing      0.2.7   The intelligent phonetic input method library  
skk-jisyo       201212  Dictionary collection for SKK  
autoconf        2.69nb2 Generates automatic source code configuration scripts  
pkg-config      0.28    System for managing library compile/link flags 
python27        2.7.5   Interpreted, interactive, object-oriented programming language
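
Since the question mentions a BSD system, a more portable spelling of the same substitution may be useful. The sketch below assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do) and inserts a literal tab rather than relying on `\t` in the replacement, which not every sed interprets. The function name `split_pkg_sed` is made up for illustration.

```shell
# Hypothetical helper: turn "name-version   description" into
# "name<TAB>version<TAB>description", reading lines from stdin.
split_pkg_sed() {
    tab=$(printf '\t')   # a literal tab character
    # Group 1 is greedy, so it backtracks only to the LAST hyphen in the
    # first whitespace-delimited word; group 2 is the version.
    sed -E "s/^([^[:blank:]]+)-([^[:blank:]]+)[[:blank:]]+/\1${tab}\2${tab}/"
}

# Example with the most complex line from the question:
printf '%s\n' 'skk-jisyo-cdb-201212 Dictionary collection for SKK' | split_pkg_sed
```

In real use the function would be fed from the command itself, for example `pkg_info -a | split_pkg_sed`.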

Or, with awk:

$ awk -v OFS='\t' '{ pkg=ver=$1; sub(/-[^-]+$/,"",pkg); sub(/.*-/,"",ver); sub(/[^[:space:]]+[[:space:]]+/,""); print pkg, ver, $0}' file  
yaesu   0.13nb1 Control interface for Yaesu FT-890 HF transceiver   
skk-jisyo-cdb   201212  Dictionary collection for SKK  
dbskkd-cdb      2.00nb1 SKK dictionary server based on cdb
libchewing      0.2.7   The intelligent phonetic input method library  
skk-jisyo       201212  Dictionary collection for SKK  
autoconf        2.69nb2 Generates automatic source code configuration scripts  
pkg-config      0.28    System for managing library compile/link flags 
python27        2.7.5   Interpreted, interactive, object-oriented programming language

Change the separator from a tab to whatever you want.
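
As a rough illustration of changing the separator, the awk command above can be wrapped in a small shell function that takes the separator as an optional argument. The function name `split_pkg_awk` and its argument convention are assumptions made for this sketch, and `[ \t]` is used in place of `[[:space:]]` only to stay friendly to older awk implementations.

```shell
# Hypothetical wrapper: $1 is the output separator (defaults to a tab).
split_pkg_awk() {
    awk -v OFS="${1:-\\t}" '{
        pkg = ver = $1                    # both start as "name-version"
        sub(/-[^-]+$/, "", pkg)           # drop the trailing "-version"
        sub(/.*-/, "", ver)               # keep only what follows the last hyphen
        sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # strip the first field from $0
        print pkg, ver, $0
    }'
}

# Example: use "|" instead of a tab.
printf '%s\n' 'pkg-config-0.28     System for managing library compile/link flags' |
    split_pkg_awk '|'
```

Called without an argument, the function should behave like the original command and separate the three elements with tabs.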

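You can use the default field separator and the split function on field 1. Then just append the field separator and the last item of the split to the first field: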

awk '{n=split($1, a, "-"); $1=$1 FS a[n]}1'
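
As written, that one-liner appears to add the version as an extra field while leaving it attached to the name, and it rebuilds the line with single spaces. A possible extension of the same split idea that yields the three tab-separated elements the question asks for is sketched below; the function name `split_pkg_split` is hypothetical, and the sketch assumes every first field contains at least one hyphen.

```shell
# Hypothetical helper built on split(): name<TAB>version<TAB>description.
split_pkg_split() {
    awk '{
        n = split($1, a, "-")             # a[n] is the text after the last hyphen
        ver = a[n]
        # Everything in $1 before "-version" is the package name.
        pkg = substr($1, 1, length($1) - length(ver) - 1)
        desc = $0
        sub(/^[ \t]*[^ \t]+[ \t]+/, "", desc)   # remove the first field, keep the rest verbatim
        printf "%s\t%s\t%s\n", pkg, ver, desc
    }'
}

# Example with a line from the question:
printf '%s\n' 'dbskkd-cdb-2.00nb1  SKK dictionary server based on cdb' | split_pkg_split
```

Working on a copy of `$0` rather than assigning to `$1` keeps the spacing inside the description untouched.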
