Sed command in Linux

Question

How do I extract URLs from a file? My file is named URL_name.txt and it contains many URLs. It looks like this:

<pre>
<pre><div></pre><something>something here<href="http://www.google.com/">something here</font>
<font><href="http://www.stackoverflow.com/">something</td>

..
..
..
</pre>

My idea is to remove everything before each URL and then everything after it. How can I do this with the sed command? The output should be:

http://www.google.com/

http://www.stackoverflow.com/

Answer 1

Use tr and grep:

tr '"' '\n' < URL_name.txt | grep http

Answer 2

This is possible in Java, but you can also try the commands below:

egrep -ie "<*HREF=(.*?)>" index.html | cut -d '"' -f 2 | grep ://
egrep -ie "<*HREF=(.*?)>" index.html | awk -F'"' '{print $2}' | grep ://

Answer 3

You can use grep:

grep -o 'http://[^"]*' yourfile
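Since the question asks specifically about sed, here is a minimal sketch of the "remove everything before and after the URL" idea as a single substitution. It assumes each line holds at most one href="..." attribute (with several, the greedy leading .* keeps only the last one) and that href is lowercase, as in the sample. The function name extract_urls and the temporary sample file are made up for illustration.

```shell
# Hypothetical helper: print the href="..." value found on each line of a file.
extract_urls() {
    # -n suppresses the default output; the trailing p prints only the lines
    # where the substitution matched. \( \) captures the text between the
    # quotes, and \1 replaces the whole line with that capture.
    sed -n 's/.*href="\([^"]*\)".*/\1/p' "$1"
}

# Sample input modeled on the file shown in the question (assumed content).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<div><something>something here<href="http://www.google.com/">something here</font>
<font><href="http://www.stackoverflow.com/">something</td>
EOF

extract_urls "$sample"
```

On the real file this would be called as extract_urls URL_name.txt. If a line can contain more than one URL, the grep -o command above is probably the simpler choice.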


3 answers

solution1
2 2016-01-20 08:23:21

solution2
0 2016-01-20 08:18:18

solution3
0 ACCEPTED 2016-01-20 08:30:04
