Grep for a pattern at the start of each word in a file

Question

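I have a file called "page.html". It contains some links/file paths that I want to extract. I have been trying to do this in Bash but can't get it to work. All of the words/links/paths I want to grab start with "/funny/hello/there/". The goal is to print all of them to the terminal so that I can use them.

This is roughly what I have tried so far, with no luck: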

grep -E '^/funny/hello/there/' page.html

and

grep -Po '/funny/hello/there/.*?' page.html

Any help would be greatly appreciated, thanks.

Here is some sample data from the file:

`<td data-title="Blah"  class="Blah" >
                                                                                                                                        <a href="/funny/hello/there/fkljaskdjfl" title="This here">fdsksldjfah</a>
                                                                                            </td>`

My output gives me whole lines, each of which looks like this:

<a href="/funny/hello/there/fkljaskdjfl" title="This here">fdsksldjfah</a>

although the "/fkljaskdjfl" part differs from line to line.

The output I want looks like this:

/funny/hello/there/fkljaskdjfl
/funny/hello/there/kfjasdflas
/funny/hello/there/kdfhakjasa

Answer 1

You can use the following grep command:

grep -o "/funny/hello/there/[^'\"[:blank:]]*" page.html
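To illustrate why this works where the attempts in the question did not: `^` anchors the match to the start of the line, but the sample lines begin with whitespace and `<a href=`, and a lazy `.*?` at the very end of a pattern is satisfied by zero characters, so only the prefix is matched. The sketch below runs the command against a small temporary file; the file contents are adapted from the question's sample, and the second, single-quoted link is a made-up line added only for illustration.

```shell
# Hypothetical sample file modelled on the data in the question
# (the second <a> line is invented for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<td data-title="Blah"  class="Blah" >
    <a href="/funny/hello/there/fkljaskdjfl" title="This here">fdsksldjfah</a>
    <a href='/funny/hello/there/kfjasdflas'>other</a>
</td>
EOF

# -o prints only the matched text, one match per output line.
# The bracket expression stops the match at the first single quote,
# double quote, space, or tab, i.e. at the end of the attribute value.
links=$(grep -o "/funny/hello/there/[^'\"[:blank:]]*" "$tmp")
printf '%s\n' "$links"

rm -f "$tmp"
```

If the same path can appear more than once, piping the result through `sort -u` would remove duplicates.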

However, you should not use shell utilities to parse HTML; use a dedicated HTML DOM parser instead.

Solution 1
1 Accepted 2015-11-03 18:47:48
