Parsing Text File using array and Sed and Awk in Bash

I have what appears to me to be a complex text file that includes around 300 entries. I have no idea how to go about parsing this file to get the output I want. Each of my network users has an entry in the file, and each entry starts with the user name:

USER:martha
USER:Othello
USER:darwin

Underneath each user entry there is a host of information I require, but one user can have a single entry while another can have multiple entries. Here is an example of 3 such entries:

USER:martha
    POSITION: 170.198.82.13 [VLT(304394),PT(FULL)]
            CLIENT: jcrm19.1.p2ps -258-
            ACCESSPOINT: 170.198.82.13/net
            APPLICATION: 91

USER:othello 
    POSITION: 170.198.80.212 [VLT(307571),PT(FULL)]
            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

    POSITION: 170.198.80.209 [VLT(306561),PT(FULL)]
            CLIENT: jcrm14.1.p2ps -258-
            ACCESSPOINT: 170.198.80.209/net
            APPLICATION: 256

            CLIENT: pwrm14.1.p2ps -258-
            ACCESSPOINT: 170.198.80.209/net
            APPLICATION: 256

            CLIENT: pwrm14.1.p2ps -258-
            ACCESSPOINT: 170.198.80.209/net
            APPLICATION: 256


USER:darwin
    POSITION: 170.198.19.102 [VLT(297987),PT(FULL)]
            CLIENT: jcrm16.1.p2ps -258-
            ACCESSPOINT: 170.198.19.102/net
            APPLICATION: 91

The final output should look as follows:

USER        Position            Client      Application

Martha      170.198.82.13       jcrm19      91
Othello     170.198.80.212      jcrm15      256
Othello     170.198.80.209      jcrm14      256
Darwin      170.198.19.102      jcrm16      91

I have some experience with arrays, and I could grep out some of the information, assign it to variables, and print them. But I just don't know how to read the information into arrays, since the entries under each "USER" differ in length and content.

So how do I read USER:martha and then jump to USER:othello? Also, under USER:othello there are two "Positions" that I need to grab. I just don't know how to put the content I'm looking for into array variables or regular variables. I have never had to parse a file whose entries differ in length and content for each user. I'm not sure how many lines I have to read before I start assigning values for the next user. Can you provide some hints, or perhaps a piece of code that I can start with?

Thanks

Using awk with column:

awk -F '[: ]+' 'BEGIN{print "USER", "Position", "Client", "Application"} 
  $1=="USER"{u=$2} $2=="POSITION"{p=$3}$2=="CLIENT"{c=$3}
  $2=="APPLICATION"&&p{print u, p, c, $3; p=""}' file | column -t

USER     Position        Client         Application
martha   170.198.82.13   jcrm19.1.p2ps  91
othello  170.198.80.212  jcrm15.1.p2ps  256
othello  170.198.80.209  jcrm14.1.p2ps  256
darwin   170.198.19.102  jcrm16.1.p2ps  91
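
The output above keeps the full client name (jcrm19.1.p2ps), while the requested table shows only jcrm19. Here is a minimal sketch of a variant that cuts the client name at the first dot and aligns the columns with printf instead of column. It assumes the file layout shown in the question; report_users and the sample file are made-up names for illustration only.

```shell
# Hypothetical helper: prints one row per POSITION from the file given as $1.
report_users() {
  awk -F '[: ]+' '
    BEGIN { printf "%-10s %-16s %-8s %s\n", "USER", "Position", "Client", "Application" }
    $1 == "USER"     { u = $2 }                      # remember the current user
    $2 == "POSITION" { p = $3 }                      # a new position starts a new row
    $2 == "CLIENT"   { c = $3; sub(/\..*/, "", c) }  # jcrm19.1.p2ps -> jcrm19
    $2 == "APPLICATION" && p != "" {                 # first application under a position
      printf "%-10s %-16s %-8s %s\n", u, p, c, $3
      p = ""                                         # ignore further clients of this position
    }
  ' "$1"
}

# Sample data copied from the question (assumed layout).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
USER:martha
    POSITION: 170.198.82.13 [VLT(304394),PT(FULL)]
            CLIENT: jcrm19.1.p2ps -258-
            ACCESSPOINT: 170.198.82.13/net
            APPLICATION: 91

USER:othello
    POSITION: 170.198.80.212 [VLT(307571),PT(FULL)]
            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

    POSITION: 170.198.80.209 [VLT(306561),PT(FULL)]
            CLIENT: jcrm14.1.p2ps -258-
            ACCESSPOINT: 170.198.80.209/net
            APPLICATION: 256
EOF

report_users "$sample"
```

The column widths in the printf format are arbitrary; widen them if your user names or client names are longer.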

I don't have my Mac at hand, so this is untested...

awk -F: '/^USER:/{u=$2} /POSITION:/{p=$2} /CLIENT:/{c=$2} /APPLICATION:/{print u,p,c,$2}' yourfile
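
Another approach reads each blank-line-separated block as one record (RS=""):

awk -v RS="" -F'[:\n ]+' '/^USER/{u=$2}
 /POSI/{p=/^USER/?$4:$3          # IP is $4 when the block opens with USER, else $3
 for(i=1;i<=NF;i++)
     if($i=="CLIENT"){sub(/\..*/,"",$(i+1))   # cut the client name at the first dot
                      print u,p,$(i+1),$NF;break}}' file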

The output, without a header:

martha 170.198.82.13 jcrm19 91
othello 170.198.80.212 jcrm15 256
othello 170.198.80.209 jcrm14 256
darwin 170.198.19.102 jcrm16 91

You could add a header and pipe the output to column -t for better formatting.
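
Since the question asks specifically about arrays, here is a rough pure-Bash sketch of the same idea: remember the current USER and POSITION while reading line by line, and append one row to parallel arrays at the first APPLICATION under each POSITION. The function name parse_entries, the array names, and the sample file are assumptions for illustration, and the sketch relies on the layout shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fills the global arrays users, positions, clients, apps
# with one element per POSITION found in the file given as $1.
parse_entries() {
  users=() positions=() clients=() apps=()
  local line user="" pos="" client="" app rest
  while IFS= read -r line || [[ -n $line ]]; do
    case $line in
      USER:*)
        user=${line#USER:}
        user=${user%%[[:space:]]*}        # drop trailing blanks after the name
        ;;
      *POSITION:*)
        read -r _ pos rest <<< "$line"    # second word is the IP address
        ;;
      *CLIENT:*)
        read -r _ client rest <<< "$line"
        client=${client%%.*}              # jcrm15.1.p2ps -> jcrm15
        ;;
      *APPLICATION:*)
        if [[ -n $pos ]]; then            # only the first block under a POSITION
          read -r _ app <<< "$line"
          users+=("$user"); positions+=("$pos"); clients+=("$client"); apps+=("$app")
          pos=""
        fi
        ;;
    esac
  done < "$1"
}

# Sample data copied from the question (assumed layout).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
USER:othello
    POSITION: 170.198.80.212 [VLT(307571),PT(FULL)]
            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

            CLIENT: jcrm15.1.p2ps -258-
            ACCESSPOINT: 170.198.80.212/net
            APPLICATION: 256

    POSITION: 170.198.80.209 [VLT(306561),PT(FULL)]
            CLIENT: jcrm14.1.p2ps -258-
            ACCESSPOINT: 170.198.80.209/net
            APPLICATION: 256

USER:darwin
    POSITION: 170.198.19.102 [VLT(297987),PT(FULL)]
            CLIENT: jcrm16.1.p2ps -258-
            ACCESSPOINT: 170.198.19.102/net
            APPLICATION: 91
EOF

parse_entries "$sample"
for i in "${!users[@]}"; do
  printf '%-10s %-16s %-8s %s\n' "${users[i]}" "${positions[i]}" "${clients[i]}" "${apps[i]}"
done
```

For a file of around 300 entries this is fast enough, but awk will usually be shorter and quicker; the arrays are mainly useful if you need the values later in the same script.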
