
Extract a particular string from a file and output to another file using grep, awk, sed

I have a file that contains the following lines:

2013-09-08 21:00:54 SMTP connection from [78.110.75.245]:5387 (TCP/IP connection count = 20)
2013-09-08 21:00:54 SMTP connection from [188.175.142.13]:34332 (TCP/IP connection count = 20)
2013-09-08 21:45:41 SMTP connection from [58.137.11.145]:51984 (TCP/IP connection count = 20)
2013-09-08 21:49:26 SMTP connection from [109.93.248.151]:22273 (TCP/IP connection count = 20)
2013-09-08 21:49:27 SMTP connection from [37.131.64.203]:7906 (TCP/IP connection count = 20)

What I want to do is extract the IP address only and save it to a file.

I started with this

sed '^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$' file > ips

But I couldn't make it work.

Using awk:

awk -F'[][]' '{print $2}' log.file > addresses
78.110.75.245
188.175.142.13
58.137.11.145
109.93.248.151
37.131.64.203
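
To see why `$2` is the address, it may help to print every field awk produces for one line. The sketch below assumes a single sample line copied from the question; the variable names `line`, `fields` and `ip` are only illustrative. The separator `[][]` is a bracket expression containing `]` and `[`, so either bracket splits the line.

```shell
# One sample line from the question ("line" is an illustrative variable name).
line='2013-09-08 21:00:54 SMTP connection from [78.110.75.245]:5387 (TCP/IP connection count = 20)'

# Split on either '[' or ']' and print each field with its number.
fields=$(printf '%s\n' "$line" | awk -F'[][]' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"

# Field 2 is the text between the brackets, i.e. the IP address.
ip=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $2 }')
printf '%s\n' "$ip"
```

The line splits into three fields: the text before `[`, the address, and the text after `]`.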

In practice I would probably go with jasonwryan's solution, but to answer the question: your sed command doesn't work because it uses extended regular expression (ERE) syntax and even Perl-compatible regular expression (PCRE) syntax. To use ERE with sed you need to turn it on explicitly, with -r for GNU sed or -E for BSD variants. sed doesn't support PCRE at all, but you can simply drop the non-capturing group (?:...), as it doesn't really help here anyway.

As you are just pattern matching, grep is probably a better fit than sed:

$ grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' file
78.110.75.245
188.175.142.13
58.137.11.145
109.93.248.151
37.131.64.203

Notice that the anchors ^ and $ also need dropping, as the pattern you want to match does not start at the beginning of the line or end at the end of it. grep doesn't support extended regular expressions by default either, so -E is used, and -o prints only the matching part of the line rather than the whole line.
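
The effect of the anchors can be checked on a single line. This is a minimal sketch, assuming a grep that supports `-o` (GNU and BSD grep do); the variable names are illustrative only.

```shell
# One sample line from the question.
line='2013-09-08 21:00:54 SMTP connection from [78.110.75.245]:5387 (TCP/IP connection count = 20)'

# Anchored pattern: the whole line would have to be an IP address, so nothing matches.
# "|| true" only keeps grep's non-zero exit status from aborting a script run with set -e.
anchored=$(printf '%s\n' "$line" | grep -Eo '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || true)

# Unanchored pattern: the address is found in the middle of the line.
unanchored=$(printf '%s\n' "$line" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}')

printf 'anchored:   "%s"\n' "$anchored"
printf 'unanchored: "%s"\n' "$unanchored"
```

The anchored result is empty, while the unanchored one contains the address.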

The final problem is that you have just given sed a regular expression and a file. sed is not grep and won't just print out lines that match (although of course it can, this just isn't how you do it). One approach is to use the substitution command s and replace the whole line with just the IP, discarding everything before and after it:

$ sed -r 's/.+[[]([^]]+).+/\1/' file
78.110.75.245
188.175.142.13
58.137.11.145
109.93.248.151
37.131.64.203

Regexplanation:

s    # sed substitute command 
/    # the delimiter marking the start of the regexp
.+   # one or more of any character
[    # start a character class
[    # character class contains a single opening square bracket 
]    # close character class (needed so single [ isn't treated as unclosed)
(    # start capture group
[    # start character class
^]+  # one or more characters that are not ]
]    # end character class
)    # end capture group 
.+   # one or more of any character
/    # the delimiter marking the end of the regexp and start of replacement
\1   # the first capture group
/    # the delimiter marking the end of the replacement 
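
One caveat with the substitution above: a line that does not match is printed unchanged. If the log might contain lines without a bracketed address, a possible variation is to combine `-n` with the `p` flag so that only substituted lines are printed. The sketch below assumes a sed that accepts `-E` (recent GNU sed and BSD sed do; older GNU sed needs `-r`), and the second input line is made up for illustration.

```shell
# First line is from the question; the second is a hypothetical line with no brackets.
input='2013-09-08 21:49:27 SMTP connection from [37.131.64.203]:7906 (TCP/IP connection count = 20)
2013-09-08 21:50:00 some other log line without an address'

# Without -n: the non-matching line passes through untouched.
plain=$(printf '%s\n' "$input" | sed -E 's/.+[[]([^]]+).+/\1/')

# With -n and the p flag: only lines where the substitution happened are printed.
filtered=$(printf '%s\n' "$input" | sed -n -E 's/.+[[]([^]]+).+/\1/p')

printf 'plain:\n%s\n' "$plain"
printf 'filtered:\n%s\n' "$filtered"
```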

Here is a comparison of different regular expression flavours.

You can use sed to match the content inside the square brackets []:

sed 's/.*\[\(.*\)\].*/\1/' log.file
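
Because `.*` is greedy, this command picks the last bracketed group on a line. That is fine for the log in the question, where each line has exactly one pair of brackets. The sketch below uses a made-up line with two bracketed groups to show the difference, together with a variant built from negated bracket expressions that takes the first group instead; both the sample line and the variant are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical line with two bracketed groups (not from the original log).
line='SMTP connection from [78.110.75.245]:5387 [extra] end'

# Greedy version: the leading .* runs up to the LAST '[' on the line.
greedy=$(printf '%s\n' "$line" | sed 's/.*\[\(.*\)\].*/\1/')

# Variant: [^[]* cannot cross a '[', and [^]]* cannot cross a ']',
# so the FIRST bracketed group is captured.
first=$(printf '%s\n' "$line" | sed 's/[^[]*\[\([^]]*\)\].*/\1/')

printf 'greedy: %s\n' "$greedy"
printf 'first:  %s\n' "$first"
```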
