How to get the text of a specific tag from among similar tags in XML in Python?
I have the following tags:
<PREAMHD>
<HD SOURCE="HED">Applicants:</HD>
<P>Fortune V Separate Account</P>
</PREAMHD>
<PREAMHD>
<HD SOURCE="HED">FILING DATES:</HD>
<P>The application was filed on September 20, 2021</P>
</PREAMHD>
I tried the following, but it returns the text of every P tag for each PREAMHD tag:
if pre.findall("./PREAMHD"):
    DATES = ''
    for dates in pre.findall("./PREAMHD/HD"):
        checking_date = dates.text
        print(checking_date)
        if 'DATES' in checking_date:
            print('filing')
            for dates_phd in pre.findall("./PREAMHD/P"):
                print(dates_phd.text)
                for para1 in dates_phd.itertext():
                    DATES += para1.replace('DATES:', '').replace('\n', ' ')
                DATES = ' '.join(DATES.split())
                print(DATES)
    message_body += 'Dated:' + str(DATES)
How can I get only the P tag text for the filing dates? Any help would be appreciated.
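You can use an XPath expression, specifically the [tag='text'] predicate syntax.
It selects all elements that have a child named tag whose complete text content, including descendants, equals the given text.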
>>> pre.findall('./PREAMHD[HD="FILING DATES:"]/P')
[<Element 'P' at 0x11c239540>]
>>> for p in pre.findall('./PREAMHD[HD="FILING DATES:"]/P'):
... p.text
'The application was filed on September 20, 2021'
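For anyone who wants to try this outside the interpreter session, here is a minimal self-contained sketch of the same predicate. The `<ROOT>` wrapper and the `paragraphs_under` helper are assumptions made for illustration; the question does not show the real parent element. It also assumes the heading text contains no double quotes, since the heading is pasted directly into the path.

```python
import xml.etree.ElementTree as ET

# <ROOT> is a hypothetical wrapper; the real parent element is not shown in the question.
SAMPLE = """
<ROOT>
  <PREAMHD>
    <HD SOURCE="HED">Applicants:</HD>
    <P>Fortune V Separate Account</P>
  </PREAMHD>
  <PREAMHD>
    <HD SOURCE="HED">FILING DATES:</HD>
    <P>The application was filed on September 20, 2021</P>
  </PREAMHD>
</ROOT>
"""


def paragraphs_under(pre, heading):
    """Return the text of each P whose sibling HD text equals `heading` exactly.

    Hypothetical helper, not part of ElementTree. Assumes `heading`
    contains no double quotes, because it is embedded in the path as-is.
    """
    path = f'./PREAMHD[HD="{heading}"]/P'
    # itertext() also collects text inside nested inline tags;
    # split/join collapses newlines and repeated spaces.
    return [" ".join("".join(p.itertext()).split()) for p in pre.findall(path)]


pre = ET.fromstring(SAMPLE)
filing_dates = paragraphs_under(pre, "FILING DATES:")
print(filing_dates)
```

Because the predicate filters on the PREAMHD element itself, only the P children of the matching block are returned, which avoids the nested loop over every `./PREAMHD/P` from the question.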