How can I get text from only <p> and <h2> tags when finding element by class with Selenium and Python?
I am trying to get the text from only the h2 and the first p tag. I've been using the class name to find the div, and the output gives me all of the text in the div (obviously).
Here is the HTML:
<div class="horoscope-content">
<h2> Today's Libra Horoscope for January 27, 2022 <span class="today-badge">TODAY</span></h2>
<p>Go with the flow, Libra. If you find that a situation isn't unfolding the way you'd like it to, take it as a sign to back off. Swimming upstream is hard work, so use your energy more efficiently by exploring different options. When you step back from a stressful situation, circumstances could turn around. Lighten up by considering other possibilities or talking it through with a helpful friend.</p>
<p>What's in the stars for you tomorrow? <a href="/horoscopes/daily/libra/friday">Read it now</a>.</p>
<div class="dropdown-inline">Read the <b>daily horoscope</b> for another zodiac sign:<div id="dropdown_below_horoscope_dropdown" class="dropdown">
Here is the code I'm using:
libra_content = driver.find_elements(By.CLASS_NAME, 'horoscope-content')
I assume the answer is to use XPath, but I can't figure out how to include both tags. Do I need two separate lines of code, or can I combine both into one?
You could use:
For h2:
libra_content = driver.find_element(By.CSS_SELECTOR, "div[class='horoscope-content'] > h2")
For p (find_element returns only the first match, so this is the first p):
libra_content = driver.find_element(By.CSS_SELECTOR, "div[class='horoscope-content'] > p")
You could use:
libra_content = driver.find_elements(By.XPATH, 'your_path')
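As a rough illustration of what 'your_path' might look like for the HTML in the question, an XPath union (`|`) can match the h2 and the first p in a single call. This is only a sketch: the helper name `heading_and_first_paragraph` and the constant `HEADING_AND_FIRST_P` are made up for this example, and it assumes the container's class attribute is exactly `horoscope-content`.

```python
try:
    from selenium.webdriver.common.by import By
    XPATH = By.XPATH
except ImportError:
    # By.XPATH is just the string "xpath"; this fallback only lets the
    # sketch be imported on a machine without Selenium installed.
    XPATH = "xpath"

# Union of two paths: the <h2> child and the first <p> child of the container.
# Assumption: the class attribute is exactly "horoscope-content".
HEADING_AND_FIRST_P = (
    "//div[@class='horoscope-content']/h2"
    " | //div[@class='horoscope-content']/p[1]"
)


def heading_and_first_paragraph(driver):
    """Return the text of every element matched by the union (hypothetical helper)."""
    elements = driver.find_elements(XPATH, HEADING_AND_FIRST_P)
    return [element.text for element in elements]
```

With a live session, calling `heading_and_first_paragraph(driver)` after loading the page should give a list with the heading text and the first paragraph's text, normally in document order.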
Read this:
Try this:
<div>
<h2 class="horoscope-content" >........</h2>
<p class="horoscope-content" >........</p>
<p>.......</p>
libra_content = driver.find_elements(By.CLASS_NAME, 'horoscope-content')
libra_content = [x.find_element(By.XPATH,'./h2[1]').text for x in driver.find_elements(By.CLASS_NAME, 'horoscope-content')]
You could do something like this instead for both values if you want to store them both.
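One way to store both values could look like the sketch below. It assumes the markup from the question, where the class sits on the wrapping div and the h2 and p are its direct children; the function name `collect_horoscopes` and the dictionary keys are invented for the example.

```python
try:
    from selenium.webdriver.common.by import By
    CLASS_NAME, XPATH = By.CLASS_NAME, By.XPATH
except ImportError:
    # These By constants are plain strings; the fallback only lets the
    # sketch be imported on a machine without Selenium installed.
    CLASS_NAME, XPATH = "class name", "xpath"


def collect_horoscopes(driver):
    """Return one dict per container with its heading and first paragraph.

    Hypothetical helper: assumes each 'horoscope-content' element has an
    <h2> child and at least one <p> child, as in the question's HTML.
    """
    results = []
    for container in driver.find_elements(CLASS_NAME, "horoscope-content"):
        results.append({
            # "./" keeps the search relative to this container only.
            "heading": container.find_element(XPATH, "./h2").text,
            "first_paragraph": container.find_element(XPATH, "./p[1]").text,
        })
    return results
```

If a container lacks one of the two children, `find_element` raises `NoSuchElementException`, so a real script may want to handle that case.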
I solved it using CSS selectors, but didn't combine them into one. Another commenter's answer, which combines XPath with the class name, is another possible solution.
libra_h2 = driver.find_element(By.CSS_SELECTOR, 'div.horoscope-content > h2')
libra_p = driver.find_element(By.CSS_SELECTOR, 'div.horoscope-content > p')
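For anyone who does want a single call, a CSS selector list (two selectors separated by a comma) is one option. The sketch below is not what I ended up using; the names `HEADING_AND_FIRST_P_CSS` and `heading_and_first_paragraph_css` are made up, and `p:first-of-type` assumes the paragraph I want is the first p among the div's children, as in the HTML above.

```python
try:
    from selenium.webdriver.common.by import By
    CSS_SELECTOR = By.CSS_SELECTOR
except ImportError:
    # By.CSS_SELECTOR is just the string "css selector"; this fallback only
    # lets the sketch be imported on a machine without Selenium installed.
    CSS_SELECTOR = "css selector"

# Selector list: the <h2> child, plus the first <p> among the div's children.
HEADING_AND_FIRST_P_CSS = (
    "div.horoscope-content > h2, "
    "div.horoscope-content > p:first-of-type"
)


def heading_and_first_paragraph_css(driver):
    """Return the text of each matched element (hypothetical helper)."""
    elements = driver.find_elements(CSS_SELECTOR, HEADING_AND_FIRST_P_CSS)
    return [element.text for element in elements]
```

Note that the h2 text would normally still include the "TODAY" badge, since `.text` covers the visible text of child elements such as the span.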