Using grep with a regex to extract text from a multi-line block in a file

Question

I have a chunk of text in a file:

<tr bgcolor="#F9F9F9">
     <td align="left">8/7/2012 11:23:42 AM</td>
     <td align="left"><em>Here is the text I want to parse out</em></td>
     <td class="ra">9.00</td>
     <td class="ra">297.00</td>
     <td class="ra">0.00</td>
     <td class="ra">0.00</td>
     <td class="ra">$0.00</td>
     <td class="ra">$0.50</td>
     <td class="ra"></td>
 </tr>

Using grep, I would like to end up with this result:

Here is the text I want to parse out

So far I have:

cat file.txt | grep -m 1 -oP '<em>[^</em>]*'

but that does not work. Thanks for your help!

Answer 1

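Your pattern fails because [^</em>] is a character class, not a negated string: it matches any single character other than <, /, e, m, or >, so the match stops at the first "e" in the text. A correct regex would be (?<=<em>).*?(?=</em>) . The lookbehind and lookahead anchor the match between the two tags without including the tags in the output.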

So, try:

grep -m 1 -oP '(?<=<em>).*?(?=</em>)' file.txt
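
As a quick check, the two patterns can be compared side by side on a cut-down copy of the sample row. This is only a sketch: it assumes GNU grep built with PCRE support (the -P flag is not available in every grep), and the temporary file and the variable names sample_file, broken, and fixed are made up for the illustration.

```shell
# Hypothetical temp file holding a shortened copy of the row from the question.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
<tr bgcolor="#F9F9F9">
     <td align="left">8/7/2012 11:23:42 AM</td>
     <td align="left"><em>Here is the text I want to parse out</em></td>
     <td class="ra">9.00</td>
 </tr>
EOF

# Original attempt: the character class excludes <, /, e, m and >,
# so the match ends right before the first "e" of "Here".
broken=$(grep -m 1 -oP '<em>[^</em>]*' "$sample_file")
echo "$broken"   # prints: <em>H

# Lookaround version: -o prints only the text between the tags,
# and -m 1 stops after the first matching line.
fixed=$(grep -m 1 -oP '(?<=<em>).*?(?=</em>)' "$sample_file")
echo "$fixed"    # prints: Here is the text I want to parse out

rm -f "$sample_file"
```

Note that grep works line by line, so this only finds an <em>...</em> pair that opens and closes on the same line, as it does in the sample.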

Answer 1: 4, accepted, 2012-08-07 17:17:32
