Processing strings in bash

Question

I have a file that contains a Google results page I got after running a search. I used

w3m -no-cookie "$search" > google

to create the page.

After that I need to get all the sites contained in that page, so basically all the strings that start with "www" and end with "/".

I tried:

grep -Fw "www" google | awk -F "/" '{ print $1";" }'

but it gives me everything that is on the line before www.

How do I remove that?

Should I use sed?

Thanks!

Answer 1

Assuming that all sites start with www is a bit weird, but here it is:

Your problem is that grep returns the whole matching line. With -o it returns only the matched part:

grep -wo "www.*" google | awk -F "/" '{ print $1";" }'

or simply:

grep -wo "www[^/]*" google
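
To see the difference between the two modes, here is a small sketch you can run without the real page. The sample lines and the example.com/.org/.net hosts are made up for illustration, and the temporary file only stands in for your google file:

```shell
# Hypothetical sample standing in for the saved search page.
sample=$(mktemp)
printf '%s\n' \
  'Result 1: http://www.example.com/page one' \
  'no site on this line' \
  'Result 2: see www.example.org/ and www.example.net/about' > "$sample"

# Without -o, grep prints every matching line in full.
full_lines=$(grep -w 'www' "$sample")

# With -o, grep prints only the matched text, one match per output line.
# [^/]* stops the match just before the first slash.
sites=$(grep -o 'www[^/]*' "$sample")

printf '%s\n' "$sites"
rm -f "$sample"
```

Note that [^/]* runs to the end of the line when there is no slash after the host, so on a messier page a narrower class such as [^/[:space:]]* may work better; that variation is a suggestion, not something tested against your page.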

Answer 1 3 2012-08-04 17:19:30
