
How to get text including tags with python webbot?

I have this text in an HTML file:

<section id="question-a.1" class="level2">
<p>Blabla</p>
<p>if a==b:</p>
<p>c = True</p>
<p>elif a &gt; b+10:</p>
<p>c = True</p>
<p>else:</p>
<p>c = False</p>
</section>

But with Python webbot, when I try to get this element like so:

from webbot import Browser

web = Browser()
web.go_to("there")
ques = web.find_elements(id="question-a.1")[0].text

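I get the text fine, but without the p tags, and I need them. Is there a way to get everything inside the section tag, including the p tags (or any other tags, such as math)?

Thanks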

You can call .get_attribute('outerHTML') on the selected element to retrieve its actual HTML, for example:

from webbot import Browser
web = Browser()
web.go_to('google.com')
web.find_elements(id="jhp big")[0].get_attribute('outerHTML')

which yields

'<a class="gb_Ld gb_od" role="button" tabindex="0" style="color:#ffffff;background-color:#4285F4">مراجعة</a>'
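Applied to the section from the question, the same call could be wrapped as in the minimal sketch below. The helper name `outer_html` and the `FakeElement` stand-in are hypothetical and exist only so the snippet runs without a browser. The sketch assumes that webbot's `find_elements()` returns Selenium-style elements exposing `get_attribute()`, as the example above suggests.

```python
def outer_html(element):
    """Return the element's markup, tags included.

    Assumes `element` is a Selenium-style element with get_attribute(),
    like the ones find_elements() returned in the example above.
    """
    return element.get_attribute("outerHTML")


class FakeElement:
    """Hypothetical stand-in for a real element (no browser needed)."""

    def __init__(self, html):
        self._html = html

    def get_attribute(self, name):
        # Only the single attribute used by this sketch is emulated.
        if name == "outerHTML":
            return self._html
        return None


section = FakeElement(
    '<section id="question-a.1" class="level2"><p>Blabla</p></section>'
)
print(outer_html(section))

# With a real page the call would look like this (not executed here):
# from webbot import Browser
# web = Browser()
# web.go_to("there")
# ques = outer_html(web.find_elements(id="question-a.1")[0])
```

The only change from the question's code is replacing `.text` with `.get_attribute("outerHTML")`.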

Or you can use innerHTML to get the inner HTML without the wrapping tag:

web.find_elements(id="fbar")[0].get_attribute('innerHTML') 

which yields

<div class="fbar"><div class="b2hzT"><style data-iml="1585133904756">.b0KoTc{color:rgba(0,0,0,.54);padding-right:27px}.Q8LRLc{font-size:15px}.b0KoTc{margin-right:30px;text-align:right}.b2hzT{border-bottom:1px solid #e4e4e4}</style><div class="b0KoTc"><span class="Q8LRLc">مصر</span></div></div><span id="fsr"><a class="Fx4vi" href="https://policies.google.com/privacy?fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://policies.google.com/privacy%3Ffg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQ8awCCBE">الخصوصية</a><a class="Fx4vi" href="https://policies.google.com/terms?fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://policies.google.com/terms%3Ffg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQ8qwCCBI">البنود</a><span style="display:inline-block;position:relative"><a class="Fx4vi" href="https://www.google.com/preferences?hl=ar" id="fsettl" aria-controls="fsett" aria-expanded="false" aria-haspopup="true" role="button" jsaction="foot.cst" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/preferences%3Fhl%3Dar&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQzq0CCBM">الإعدادات</a><span id="fsett" aria-labelledby="fsettl" role="menu" style="display:none"><a href="https://www.google.com/preferences?hl=ar&amp;fg=1" role="menuitem">إعدادات البحث</a><a href="/advanced_search?hl=ar&amp;fg=1" role="menuitem">بحث متقدم</a><a href="/history/privacyadvisor/search/unauth?utm_source=googlemenu&amp;fg=1" role="menuitem">بياناتك في خدمة "بحث"</a><a href="/history/optout?hl=ar&amp;fg=1" role="menuitem">السجلّ</a><a href="//support.google.com/websearch/?p=ws_results_help&amp;hl=ar&amp;fg=1" role="menuitem">مساعدة البحث</a><a href="#" data-bucket="websearch" role="menuitem" id="dk2qOd" target="_blank" jsaction="gf.sf" data-ved="0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQLggU">إرسال تعليقات</a></span></span></span><span id="fsl"><a class="Fx4vi" 
href="https://www.google.com/intl/ar_eg/ads/?subid=ww-ww-et-g-awa-a-g_hpafoot1_1!o2&amp;utm_source=google.com&amp;utm_medium=referral&amp;utm_campaign=google_hpafooter&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/intl/ar_eg/ads/%3Fsubid%3Dww-ww-et-g-awa-a-g_hpafoot1_1!o2%26utm_source%3Dgoogle.com%26utm_medium%3Dreferral%26utm_campaign%3Dgoogle_hpafooter%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQkdQCCBU">الإعلانات</a><a class="Fx4vi" href="https://www.google.com/services/?subid=ww-ww-et-g-awa-a-g_hpbfoot1_1!o2&amp;utm_source=google.com&amp;utm_medium=referral&amp;utm_campaign=google_hpbfooter&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/services/%3Fsubid%3Dww-ww-et-g-awa-a-g_hpbfoot1_1!o2%26utm_source%3Dgoogle.com%26utm_medium%3Dreferral%26utm_campaign%3Dgoogle_hpbfooter%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQktQCCBY">الأعمال</a><a class="Fx4vi" href="https://about.google/?utm_source=google-EG&amp;utm_medium=referral&amp;utm_campaign=hp-footer&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://about.google/%3Futm_source%3Dgoogle-EG%26utm_medium%3Dreferral%26utm_campaign%3Dhp-footer%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQkNQCCBc">حول</a><a class="Fx4vi" href="//google.com/search/howsearchworks/?fg=1">  آلية عمل "بحث Google" </a></span></div>

By comparison, outerHTML would yield

<div class="EvHmz hRvfYe" id="fbar"><div class="fbar"><div class="b2hzT"><style data-iml="1585133904756">.b0KoTc{color:rgba(0,0,0,.54);padding-right:27px}.Q8LRLc{font-size:15px}.b0KoTc{margin-right:30px;text-align:right}.b2hzT{border-bottom:1px solid #e4e4e4}</style><div class="b0KoTc"><span class="Q8LRLc">مصر</span></div></div><span id="fsr"><a class="Fx4vi" href="https://policies.google.com/privacy?fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://policies.google.com/privacy%3Ffg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQ8awCCBE">الخصوصية</a><a class="Fx4vi" href="https://policies.google.com/terms?fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://policies.google.com/terms%3Ffg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQ8qwCCBI">البنود</a><span style="display:inline-block;position:relative"><a class="Fx4vi" href="https://www.google.com/preferences?hl=ar" id="fsettl" aria-controls="fsett" aria-expanded="false" aria-haspopup="true" role="button" jsaction="foot.cst" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/preferences%3Fhl%3Dar&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQzq0CCBM">الإعدادات</a><span id="fsett" aria-labelledby="fsettl" role="menu" style="display:none"><a href="https://www.google.com/preferences?hl=ar&amp;fg=1" role="menuitem">إعدادات البحث</a><a href="/advanced_search?hl=ar&amp;fg=1" role="menuitem">بحث متقدم</a><a href="/history/privacyadvisor/search/unauth?utm_source=googlemenu&amp;fg=1" role="menuitem">بياناتك في خدمة "بحث"</a><a href="/history/optout?hl=ar&amp;fg=1" role="menuitem">السجلّ</a><a href="//support.google.com/websearch/?p=ws_results_help&amp;hl=ar&amp;fg=1" role="menuitem">مساعدة البحث</a><a href="#" data-bucket="websearch" role="menuitem" id="dk2qOd" target="_blank" jsaction="gf.sf" data-ved="0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQLggU">إرسال تعليقات</a></span></span></span><span id="fsl"><a class="Fx4vi" 
href="https://www.google.com/intl/ar_eg/ads/?subid=ww-ww-et-g-awa-a-g_hpafoot1_1!o2&amp;utm_source=google.com&amp;utm_medium=referral&amp;utm_campaign=google_hpafooter&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/intl/ar_eg/ads/%3Fsubid%3Dww-ww-et-g-awa-a-g_hpafoot1_1!o2%26utm_source%3Dgoogle.com%26utm_medium%3Dreferral%26utm_campaign%3Dgoogle_hpafooter%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQkdQCCBU">الإعلانات</a><a class="Fx4vi" href="https://www.google.com/services/?subid=ww-ww-et-g-awa-a-g_hpbfoot1_1!o2&amp;utm_source=google.com&amp;utm_medium=referral&amp;utm_campaign=google_hpbfooter&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://www.google.com/services/%3Fsubid%3Dww-ww-et-g-awa-a-g_hpbfoot1_1!o2%26utm_source%3Dgoogle.com%26utm_medium%3Dreferral%26utm_campaign%3Dgoogle_hpbfooter%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQktQCCBY">الأعمال</a><a class="Fx4vi" href="https://about.google/?utm_source=google-EG&amp;utm_medium=referral&amp;utm_campaign=hp-footer&amp;fg=1" ping="/url?sa=t&amp;rct=j&amp;source=webhp&amp;url=https://about.google/%3Futm_source%3Dgoogle-EG%26utm_medium%3Dreferral%26utm_campaign%3Dhp-footer%26fg%3D1&amp;ved=0ahUKEwiOvK76u7XoAhVEEncKHf5pA1YQkNQCCBc">حول</a><a class="Fx4vi" href="//google.com/search/howsearchworks/?fg=1">  آلية عمل "بحث Google" </a></span></div></div>
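Because the Google output above is long, the difference is easier to see on a small input. The sketch below does not use webbot. It parses a shortened copy of the section from the question with the standard-library `xml.etree.ElementTree` (the snippet happens to be well-formed XML) and builds rough analogues of `.text`, `innerHTML` and `outerHTML`. The function names are made up for illustration, and a real browser may serialize whitespace and attributes slightly differently.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the section from the question.
SOURCE = """<section id="question-a.1" class="level2">
<p>Blabla</p>
<p>if a==b:</p>
<p>elif a &gt; b+10:</p>
</section>"""

section = ET.fromstring(SOURCE)


def text_only(elem):
    """Rough analogue of .text: character data only, no tags."""
    return "".join(elem.itertext())


def inner_markup(elem):
    """Rough analogue of innerHTML: the children, without the wrapper tag."""
    leading = elem.text or ""
    children = "".join(ET.tostring(child, encoding="unicode") for child in elem)
    return leading + children


def outer_markup(elem):
    """Rough analogue of outerHTML: the wrapper tag plus its children."""
    return ET.tostring(elem, encoding="unicode")


print("text only:", repr(text_only(section)))
print("inner:", repr(inner_markup(section)))
print("outer:", repr(outer_markup(section)))
```

The first result contains no tags at all, the second keeps the p tags but drops the surrounding section tag, and the third keeps both. This matches the behaviour described in the question and the two attributes shown above.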

