
substr in awk statement from xml parse

Link to the original question: bash script extract XML data into column format. Now for a modification and an explanation:

Something in this line of code is not correct. I believe the problem is in the substr portion, which I don't fully understand, and I would like to learn how to understand it better. Yes, I have looked at the documentation, and it's not fully clicking. A couple of examples as well as an answer would really be helpful.

awk -F'[<>]' 'BEGIN{a["STKPR"]="Prod";a["STKSVBLKU"]="Prod";a["STKSVBLOCK"]="Prod";a["STKSVBLK2"]="Test";} /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Test";} /SessionHost/+/Host/{print type, name, $3;}'|sort -u

This bit here:

type=a[substr(name,length(name))]; if (length(type)==0) type="Test";

Here is the XML format. Each host has a block like this one, containing the hostname and IP.

<?xml version="1.0"?>
<Connection>
  <ConnectionType>Putty</ConnectionType>
  <CreatedBy>Someone</CreatedBy>
  <CreationDateTime>2014-10-27T11:53:32.0157492-04:00</CreationDateTime>
  <CredentialConnectionID>9F3C3BCF-068A-4927-B996-CA52154CAE3B</CredentialConnectionID>
  <Description>Red Hat Enterprise Linux 5 (64-bit)</Description>
  <Events>
    <OpenCommentPrompt>true</OpenCommentPrompt>
    <WarnIfAlreadyOpened>true</WarnIfAlreadyOpened>
  </Events>
  <Group>PATH/TO/GROUP/NAME</Group>
  <ID>f2007f03-3b33-47d3-8335-ffd84ccc0e6b</ID>
  <MetaInformation />
  <Name>STKSPRDAPP01111</Name>
  <OpenEmbedded>true</OpenEmbedded>
  <PinEmbeddedMode>False</PinEmbeddedMode>
  <Putty>
    <AlwaysAskForPassword>true</AlwaysAskForPassword>
    <Domain>DOMAIN</Domain>
    <FontSize>12</FontSize>
    <Host>10.0.0.111</Host>
    <Port>22</Port>
    <PortFowardingArray />
    <TelnetEncoding>IBM437</TelnetEncoding>
  </Putty>
  <Stamp>85407098-127d-4d3c-b7fa-8f174cb1e3bd</Stamp>
  <SubMode>2</SubMode>
  <TemplateName>SSH-PerUserCreds</TemplateName>
</Connection>

What I want to do is similar to the referenced link above. But here I want to match:

BEGIN{a["STKPR"]="Prod";a["STKSVBLKU"]="Prod";a["STKSVBLOCK"]="Prod";a["STKSVBLK2"]="Test";}

and treat all of the rest as Test. It is best to read the previous post to make this one more understandable. Thank you.

Because your keys here are of different lengths, the substr approach is less than optimal. Try:

awk -F'[<>]' '/Name/{n=$3;t="Test"; if(n ~ /^STKPR/) t="Prod"; if (n ~/^STKSVBLKU/) t="Prod"; if (n ~/^STKSVBLOCK/) t="Prod"} /SessionHost/+/Host/{print t, n, $3;}' sample.xml |sort -u
Test STKSPRDAPP01111 10.0.0.111

How It Works

In this case, the type, denoted by t, is set according to a series of if statements. From the above code, they are:

t="Test"
if (n ~ /^STKPR/) t="Prod"
if (n ~ /^STKSVBLKU/) t="Prod" 
if (n ~ /^STKSVBLOCK/) t="Prod"

By setting t="Test", Test becomes the default: the type will be Test unless another statement matches. Each of the following statements looks at the string that begins the host name and, if there is a match, sets type t to a new value. (When a regular expression begins with ^, that means that what follows must match at the beginning of the string.)
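To see the default and the ^ anchor in action, here is a small sketch that feeds a few names straight into the same if statements. The classify function and every host name except STKSPRDAPP01111 are made up for illustration.

```shell
# classify: hypothetical helper; prints "type name" for each name on stdin
classify() {
  awk '{
    t = "Test"                           # default type
    if ($0 ~ /^STKPR/)      t = "Prod"   # ^ anchors the match at the start
    if ($0 ~ /^STKSVBLKU/)  t = "Prod"
    if ($0 ~ /^STKSVBLOCK/) t = "Prod"
    print t, $0
  }'
}

printf '%s\n' STKPRDAPP01 STKSVBLKU07 XSTKPR01 STKSPRDAPP01111 | classify
# Prod STKPRDAPP01
# Prod STKSVBLKU07
# Test XSTKPR01
# Test STKSPRDAPP01111
```

XSTKPR01 stays Test because STKPR is not at the start of the string, and STKSPRDAPP01111 stays Test because it begins with STKSP, not STKPR.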

Alternative using fancier regular expressions

Since the above three if statements are all for the Prod type, the three of them could, if you preferred, be combined into:

t="Test"
if (n ~ /^STK(PR|SVBLKU|SVBLOCK)/) t="Prod"

(metalcated: Fixed unmatched parenthesis)
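As a rough end-to-end sketch, the single alternation can be dropped into a complete command. The two-host input below is invented for the example, and the tighter /<Name>/ and /<Host>/ patterns are an assumption of this sketch rather than part of the original command.

```shell
# Hypothetical two-host input; only the Name and Host elements matter here.
xml='<Connection>
  <Name>STKPRDAPP01</Name>
  <Putty>
    <Host>10.0.0.1</Host>
  </Putty>
</Connection>
<Connection>
  <Name>STKSPRDAPP01111</Name>
  <Putty>
    <Host>10.0.0.111</Host>
  </Putty>
</Connection>'

# classify_xml: hypothetical helper; reads XML on stdin, prints "type name ip"
classify_xml() {
  awk -F'[<>]' '
    /<Name>/ { n = $3; t = "Test"          # remember the name, default to Test
               if (n ~ /^STK(PR|SVBLKU|SVBLOCK)/) t = "Prod" }
    /<Host>/ { print t, n, $3 }            # $3 is the text between the tags
  ' | sort -u
}

printf '%s\n' "$xml" | classify_xml
# Prod STKPRDAPP01 10.0.0.1
# Test STKSPRDAPP01111 10.0.0.111
```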

The substr portion produces a string containing only the last character of name. This is because it takes a substring of name starting at position length(name) and running to the end of the string, and because substr is indexed starting at 1.
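A few substr calls on the host name from the sample XML may make this more concrete. The substr_demo wrapper is just a hypothetical name for the illustration.

```shell
# substr_demo: hypothetical wrapper around a handful of substr calls
substr_demo() {
  awk 'BEGIN {
    name = "STKSPRDAPP01111"
    print length(name)                     # 15
    print substr(name, length(name))       # 1  (starts at the last character)
    print substr(name, 1, 5)               # STKSP  (5 characters from position 1)
    print substr(name, 4)                  # SPRDAPP01111  (position 4 to the end)
    print substr(name, length(name) - 2)   # 111  (the last three characters)
  }'
}

substr_demo
```

With the original code, the array lookup is therefore a["1"] for this host. No key in a is a single character, so type comes back empty and the fallback sets it to Test every time.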

To match whole strings, you can use your variable name directly rather than processing it with substr.

/Name/ { name=$3; type=a[name]; if (length(type)==0) type="Test"; }
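Note that a[name] only finds a type when the whole name equals one of the keys. If the keys are meant as prefixes of longer host names, one possible sketch is to loop over the keys and compare each against the start of the name. This assumes no key is a prefix of another key (true for the four above); classify_by_prefix and the test names other than STKSPRDAPP01111 are hypothetical.

```shell
# classify_by_prefix: hypothetical helper; prints "type name" per input line
classify_by_prefix() {
  awk '
    BEGIN { a["STKPR"] = "Prod"; a["STKSVBLKU"] = "Prod"
            a["STKSVBLOCK"] = "Prod"; a["STKSVBLK2"] = "Test" }
    {
      type = "Test"                        # default when no key matches
      for (k in a)                         # compare each key to the name prefix
        if (substr($0, 1, length(k)) == k) { type = a[k]; break }
      print type, $0
    }'
}

printf '%s\n' STKPR STKPRDAPP01 STKSVBLK2X STKSPRDAPP01111 | classify_by_prefix
# Prod STKPR
# Prod STKPRDAPP01
# Test STKSVBLK2X
# Test STKSPRDAPP01111
```

Here substr(name, 1, length(k)) takes exactly as many leading characters as the key has, which is what copes with keys of different lengths.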
