
How to get a string before a specific character in bash

I have some content in the file lists.txt as below:

abc.com.                IN A        10.120.51.95    ;10.40.40.57 ;old 10.20.3.57    
;def-mytaxi.com.        IN A        10.12.4.9   ;10.40.3.43 ;test
xyz-mytaxi.com.     IN CNAME        10.12.4.8   ;10.40.3.53 ;test

I need to write these to another file such that:

  1. It skips any row starting with ; (the 2nd row is skipped).

  2. It only picks rows with IN A (only the 1st row is selected).

  3. It removes the trailing . at the end of the first column (the . after abc.com. is removed).

  4. It drops everything after the first ; in the selected rows, so only abc.com 10.120.51.95 is printed.

The final output should be written to a file and look like this:

abc.com 10.120.51.95

I wrote the script below, and everything works except the 3rd and 4th steps.

I get this output:

abc.com.   10.120.51.95   10.40.40.57 old 10.20.3.57  

Here's what I tried. Can someone help me?

awk '/IN A/ {$2=$3=""; print $0}' lists.txt  | sed '/^;/d;s/;//g;s/#//g' > updated_list.txt

Your conditions are clear and well defined, so the whole job can be done with a single awk script:

awk '/^;/ || (!/IN A/) {next}          # condition 1 and condition 2
     {sub(/IN A/,"",$0);$1=$1;$0=$0}   # condition 5, FS>OFS, recompute fields
     {sub(/[.]$/,"",$1)}               # condition 3
     {sub(/[[:blank:]]*;.*$/,"",$0)}   # condition 4 (also drops the blanks before ;)
     {sub(/[.]$/,"",$NF)}              # condition 6 (IP is now in last col)
     { print }' file

  • Conditions 1 to 4 are given by the OP.
  • Condition 5 removes the IN A text.
  • Condition 6 removes a potential trailing . from the IP.
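
The $1=$1 step in the second line may look odd. Assigning to any field makes awk rebuild $0 with OFS (a single space by default) between the fields, which collapses the runs of blanks in the input. A minimal sketch of that effect, using a made-up input string rather than the OP's file:

```shell
# Hypothetical input: three fields separated by irregular blanks and a tab.
out=$(printf 'a    b \t c\n' | awk '{ $1 = $1; print }')

# The record is rebuilt with single spaces between the fields.
echo "$out"
```

The following $0=$0 then re-splits the rebuilt record, so later references to $1 and $NF see the cleaned-up fields.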

This awk should also work:

awk '!/^[[:blank:]]*;/ && / IN A / {
   sub(/\.$/, "", $1)
   print $1, $4
}' file

Output:

abc.com 10.120.51.95
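
To reproduce this end to end and write the result to a file, as the question asks, something like the following sketch should work. The file names lists.txt and updated_list.txt come from the question; the heredoc is only there to make the example self-contained.

```shell
# Recreate the sample input from the question.
cat > lists.txt <<'EOF'
abc.com.                IN A        10.120.51.95    ;10.40.40.57 ;old 10.20.3.57
;def-mytaxi.com.        IN A        10.12.4.9   ;10.40.3.43 ;test
xyz-mytaxi.com.     IN CNAME        10.12.4.8   ;10.40.3.53 ;test
EOF

# Skip comment rows, keep " IN A " rows, strip the trailing dot
# from the name, and print only the name and the IP.
awk '!/^[[:blank:]]*;/ && / IN A / {
   sub(/\.$/, "", $1)
   print $1, $4
}' lists.txt > updated_list.txt

cat updated_list.txt
```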

Could you please try the following? It was written and tested in GNU awk with the shown samples only.

awk '
!/^;/ && /IN A/{
  sub(/\.+$/,"",$1)
  match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ +;/)
  val=substr($0,RSTART,RLENGTH)
  sub(/ +;.*/,"",val)
  print $1,val
  val=""
}' Input_file

Explanation: a detailed, line-by-line explanation of the above.

awk '                                  ## Start the awk program.
!/^;/ && /IN A/{                       ## If the line does NOT start with ; AND contains IN A, do the following.
  sub(/\.+$/,"",$1)                    ## Remove all trailing dots from the first field.
  match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ +;/)   ## Match an IP address followed by spaces and a semicolon; sets RSTART and RLENGTH.
  val=substr($0,RSTART,RLENGTH)        ## Store the matched substring in val.
  sub(/ +;.*/,"",val)                  ## Strip the spaces and everything from the semicolon onward from val.
  print $1,val                         ## Print the first field and val.
  val=""                               ## Reset val.
}' Input_file                          ## Name of the input file.
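
To see what match() leaves in RSTART and RLENGTH, here is a small sketch run on the first sample row. The input line is copied from the question; the extra print of RSTART and RLENGTH is only for illustration and is not part of the answer above.

```shell
# First sample row from the question.
line='abc.com.                IN A        10.120.51.95    ;10.40.40.57 ;old 10.20.3.57'

result=$(printf '%s\n' "$line" | awk '{
  # match() returns non-zero on success and sets RSTART/RLENGTH.
  if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ +;/)) {
    val = substr($0, RSTART, RLENGTH)   # the IP plus the blanks and the ;
    sub(/ +;.*/, "", val)               # keep only the IP
    print RSTART, RLENGTH, val
  }
}')

# Prints the match offset, the match length, and the extracted IP.
echo "$result"
```

Note that match() only finds an IP that is followed by a semicolon, so a row without a trailing comment would yield an empty val.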

$ awk 'match($1,/^[^;].*/)&&$2=="IN"&&$3=="A"{print substr($1,RSTART,RLENGTH-1),$4}' file

Output:

abc.com 10.120.51.95

"Explained":

$ awk '
match($1,/^[^;].*/) && $2=="IN" && $3=="A" {     # match 
    print substr($1,RSTART,RLENGTH-1),$4         # and output
}' file

Edit: To also remove a trailing . from the IP:

awk '
match($1,/^[^;].*/) && $2=="IN" && $3=="A" {
    print substr($1,RSTART,RLENGTH-1),
        ($4~/\.$/?substr($4,1,length($4)-1):$4)  # remove extra . from ip
}' file
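
A quick way to check the conditional is to feed it a row whose IP does end in a dot. The row below is hypothetical (the sample data has no such row); the awk program is the same as above.

```shell
# Hypothetical row: like the first sample row, but with a dot after the IP.
out=$(printf '%s\n' 'abc.com.   IN A   10.120.51.95.   ;10.40.40.57' | awk '
match($1,/^[^;].*/) && $2=="IN" && $3=="A" {
    print substr($1,RSTART,RLENGTH-1),
        ($4~/\.$/?substr($4,1,length($4)-1):$4)  # drop the dot only if present
}')

echo "$out"
```

Keep in mind that substr($1,RSTART,RLENGTH-1) drops the last character of the name unconditionally, so this version assumes every name ends in a dot.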

Try this:

awk '/^[a-z].*IN A/{sub(/\.$/, "", $1); print $1, $4}' file

Alternative:

awk '{ gsub(/;.*/,"")} / IN A / { gsub(/\.$/,"",$1); printf("%s %s\n",$1,$4)}' FILE
