Python 3 Beautiful Soup: getting the inner HTML of a tag

Question

I'm trying to use BS4 and Python to save and then replace the contents of the first <translate> tag in an HTML file.

Right now I'm doing something like this:

translate_bs4 = bs4_object.find('translate')
translate_key = '{{ key }}'
# save the original markup before overwriting it
translate_initial = str(translate_bs4)
translate_bs4.string = translate_key

My test case is:

<translate>tag with <other_tag>some text</other_tag></translate>
<much_longer_file>...</much_longer_file>

The resulting HTML is what I expect:

<translate>{{ key }}</translate>
<much_longer_file>...</much_longer_file>

But the value of translate_initial is

<translate>tag with <other_tag>some text</other_tag></translate>

instead of the expected

tag with <other_tag>some text</other_tag>

I know this would be easy to extract with a regular expression, but I'd prefer a DOM-based solution.

Answer 1

Try this:

translate_bs4 = bs4_object.find('translate')
# decode_contents() renders only the tag's children, without the enclosing <translate> tag
translate_initial = translate_bs4.decode_contents(formatter="html")
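
For reference, here is a minimal end-to-end sketch using the test case from the question. It assumes the document is parsed with the built-in "html.parser" backend (the question does not say which parser is used) and leaves the formatter at its default; the variable name soup is just for illustration and stands in for bs4_object.

```python
from bs4 import BeautifulSoup

# Test case from the question, joined into one string.
html = (
    "<translate>tag with <other_tag>some text</other_tag></translate>"
    "<much_longer_file>...</much_longer_file>"
)

# Assumption: the stdlib-backed parser; lxml or html5lib may wrap the
# fragment in <html>/<body> and so change the serialized document.
soup = BeautifulSoup(html, "html.parser")

translate_bs4 = soup.find("translate")

# Inner markup only: the children are serialized, the outer tag is not.
translate_initial = translate_bs4.decode_contents()

# Replace the tag's contents with the placeholder key.
translate_bs4.string = "{{ key }}"

print(translate_initial)
print(soup)
```

The difference from str(translate_bs4) is that str() serializes the tag itself along with its children, while decode_contents() serializes the children only. Passing formatter="html", as in the answer, additionally asks Beautiful Soup to emit named HTML entities for Unicode characters where it can; for plain ASCII content like this test case it should make no difference.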