How to search for the 1st occurrence of a pattern in a line using EGREP

Question

I am using an egrep regex to search for some patterns in a file that contains URLs. I want to find only the first instance in each line. For example, this is my regex:

egrep -io '^\<http(s)://home\>+\..+\.gov(\.au)?' input.txt

It outputs this match:

https://home.xxx.gov/uuu.aspx?url=https://home.xxx.gov

But what I am really looking for in this specific example is:

https://home.xxx.gov

I do not care what comes after the .gov and I want to trim it. How can I do this?

Answer 1

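You'll need a lazy quantifier, and for that you need Perl-style regexes. Pass -P to plain grep rather than egrep, since some GNU grep versions reject egrep -P with "conflicting matchers specified":

grep -P -io '^https?://home\..+?\.gov(\.au|\.uk)?' input.txt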
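To check the lazy match without touching a real file, here is a minimal sketch that feeds the sample line from the question through a pipe. The variable names are made up for illustration, and the guard assumes that a grep without PCRE support exits with an error when given -P.

```shell
# Sample line from the question; "line" and "lazy" are hypothetical names.
line='https://home.xxx.gov/uuu.aspx?url=https://home.xxx.gov'

# Probe for -P support first: not every grep build includes PCRE.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  # .+? expands one character at a time, so it stops at the first ".gov".
  lazy=$(printf '%s\n' "$line" | grep -P -io '^https?://home\..+?\.gov(\.au|\.uk)?')
  printf 'lazy match: %s\n' "$lazy"
else
  printf 'this grep has no -P support\n' >&2
fi
```

The optional (\.au|\.uk)? group stays greedy, so a host such as home.xxx.gov.au should still be captured in full.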

If your egrep doesn't support Perl regexes, you need to find a different way, for example

egrep -io '^https?://home\.[A-Za-z0-9.]+\.gov(\.au|\.uk)?' input.txt

or

egrep -io '^https?://home\.[^/]+\.gov(\.au|\.uk)?' input.txt

limiting the range of characters that may be matched by the regex. See also @sshashank124's solution.
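The difference between the greedy pattern and a bounded character class can be seen side by side with a short sketch. It assumes a grep that supports -o (GNU and BSD grep both do), and the variable names are invented for illustration.

```shell
# Sample line from the question; variable names are hypothetical.
line='https://home.xxx.gov/uuu.aspx?url=https://home.xxx.gov'

# Greedy: .+ runs on to the last ".gov" in the line.
greedy=$(printf '%s\n' "$line" | grep -E -io '^https?://home\..+\.gov(\.au|\.uk)?')

# Bounded: [^/]+ cannot cross a "/", so the match ends with the host name.
bounded=$(printf '%s\n' "$line" | grep -E -io '^https?://home\.[^/]+\.gov(\.au|\.uk)?')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

Because the pattern is anchored with ^, each line can contribute at most one match, so -o prints only the first URL of the line.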

Answer 2

You can do it like this:

^\<https?://home\.\w+\.gov(\.au|\.uk)?
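One thing worth checking with this pattern is that \w+ matches a single host label only, because \w covers letters, digits and underscore but not dots or hyphens. A rough sketch, assuming GNU grep (\< and \w are extensions there) and using a second, made-up URL for contrast:

```shell
# The pattern from this answer, with single backslashes.
pattern='^\<https?://home\.\w+\.gov(\.au|\.uk)?'

# One label between "home." and ".gov": this matches.
one=$(printf '%s\n' 'https://home.xxx.gov/uuu.aspx?url=https://home.xxx.gov' \
  | grep -E -io "$pattern")

# Two labels ("a.b") in a hypothetical URL: \w+ cannot cross the dot,
# so grep finds nothing; "|| true" keeps the non-zero exit status harmless.
two=$(printf '%s\n' 'https://home.a.b.gov.au/index' \
  | grep -E -io "$pattern" || true)

printf 'one label:  [%s]\n' "$one"
printf 'two labels: [%s]\n' "$two"
```

If host names with several labels or hyphens are expected, a class such as [^/]+ or [A-Za-z0-9.-]+ is likely a better fit than \w+.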

2 answers

solution1
2 ACCEPTED 2014-04-25 08:47:52

solution2
1 2014-04-25 08:47:04
