How to parse HTML in Robot Framework

Question

Here is my text, which is stored in ${Tooltipdata}:

    <hr><b><strong>Task Details</strong></b><hr><b>Date Created: </b> 02/21/2014 07:52pm<br> 
<b>Date Modified: </b> 02/24/2014 05:47pm<br><b>Assigned to: </b> Administrator<br>
<b>Created By: </b> Administrator<br><b>Status: </b> Pending Input<br><b>Description:
 </b> test<br>

I want a result like this:

Task Details  Date Created:  02/21/2014 07:52pm    Date Modified:  02/24/2014 05:47pm    Assigned to:  Administrator   
 Created By:  Administrator   
 Status:  Pending Input   
 Description:  test.

Put simply, I want to remove the HTML tags.

Answer 1

You can use the Evaluate keyword to run Python's re.sub function. Something like this should work:

*** Keywords ***
| Remove HTML tags
| | [Documentation] | Strip HTML tags from the given string
| | [Arguments]     | ${string}
| | ${result}=      | Evaluate | re.sub(r'<.*?>', '', '''${string}''') | re
| | [Return]        | ${result}

*** Test cases ***
| Example
| | ${Tooltipdata}= | Some keyword which returns the tooltip data
| | ${string}= | Remove HTML tags | ${Tooltipdata}

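If you're not familiar with regular expressions, the expression above means "match the shortest string between < and >", and re.sub replaces every occurrence of such a match with an empty string.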
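To see what "shortest" means in practice, here is a minimal sketch that runs the same pattern in plain Python next to its greedy counterpart. The sample string is a shortened piece of the tooltip data from the question, and the variable names are only for illustration.

```python
import re

# Shortened piece of the tooltip data from the question.
sample = "<b>Status: </b> Pending Input<br>"

# Non-greedy: each match stops at the first '>' after a '<',
# so only the tags themselves are removed.
non_greedy = re.sub(r"<.*?>", "", sample)

# Greedy: a single match runs from the first '<' to the last '>'
# and swallows the text in between as well.
greedy = re.sub(r"<.*>", "", sample)

print(repr(non_greedy))
print(repr(greedy))
```

The non-greedy form keeps the text between the tags, while the greedy form removes everything because the sample starts with < and ends with >.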

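This will fail if your HTML markup has attributes that contain a > character, and if your data contains both a literal < and a literal >, it will also remove text that is not an HTML tag. That is the risk you take when you try to parse HTML with regular expressions. For your specific example, you should be safe.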
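Both failure modes are easy to reproduce. The sketch below applies the same pattern to two made-up inputs that are not from the question; the helper name strip_tags_with_regex is hypothetical.

```python
import re


def strip_tags_with_regex(text):
    # Same pattern as in the Remove HTML tags keyword.
    return re.sub(r"<.*?>", "", text)


# A '>' inside an attribute value ends the match too early,
# so part of the tag is left behind in the output.
attr_case = strip_tags_with_regex('<a title="a>b">link</a>')

# Plain text that happens to contain '<' and '>' is treated as a tag
# and removed together with everything between the two characters.
text_case = strip_tags_with_regex("1 < 2 and 3 > 2")

print(repr(attr_case))
print(repr(text_case))
```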

A better alternative is to write a keyword in Python and use a real HTML parsing library (such as Beautiful Soup) to parse the data. For a code example, see this question.
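As a rough sketch of that idea, the module below strips tags with a real parser. Note the swap: it uses html.parser from the Python standard library rather than Beautiful Soup, only so that it runs without extra dependencies. The names _TextCollector and remove_html_tags are hypothetical and do not come from the original answer.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def remove_html_tags(html):
    """Return the text content of the given HTML string."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


if __name__ == "__main__":
    print(remove_html_tags("<b>Status: </b> Pending Input<br>"))
```

Assuming this is saved as a module (say HtmlUtils.py, a hypothetical file name) and imported with the Library setting, Robot Framework should expose the function as the keyword Remove Html Tags, which can then take ${Tooltipdata} as its argument.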

Answer 2

You can try using a regular expression:

import re

data = "<hr><b><strong>Task Details</strong></b><hr><b>Date Created: </b> 02/21/2014 7:52pm<br><b>Date Modified: </b> 02/24/2014 05:47pm<br><b>Assigned to: </b> Administrator<br><b>Created By: </b> Administrator<br><b>Status: </b> Pending Input<br><b>Description: </b> test<br>"
# Split on simple tags such as <b>, </b> or <br>. The class [A-Za-z/] replaces
# the original [A-z\/], which also matched the punctuation between 'Z' and 'a'.
result = re.split(r'<[A-Za-z/]*>', data)

# Join the remaining text pieces and print them (Python 3 syntax)
print(''.join(result))

Answer 3

Using the String library, we can replace substrings. This is the code I used to replace the string:

${str} =    Replace String    ${Tooltipdata}    <hr>    a

3 answers

Answer 1
1 2014-05-15 16:31:13

Answer 2
0 2014-02-26 11:07:59

Answer 3
0 2014-02-27 08:01:49
