Python Beautiful Soup: Target tag only within another

Question

I've been looking for an answer to this question but I'm having little luck. Here is the HTML, which I explain below:

<div class="news-overflow-hidden">
    <h3>
        <i class="pholder"></i>
        <a href="/news/view/141524/" style="">ЕСПЧ присудил €15 000 экс-главе службы безопасности ЮКОСа</a> </h3>
    <p class="news-text">
        <a href="/news/view/141524/">В такую сумму Европейский суд по правам человека оценил несоблюдение в отношении мужчины презумпции невиновности и нарушение при исследовании свидетельских показаний в судах.</a> </p>
    <i class="news-type-icon"></i>
</div>

What I want to do is grab the <a> inside <p class="news-text">. The problem is that <p class="news-text"> also appears in other places, so if I select by that class alone I will pick up things I do not need. How can I target only the <a> tags that sit inside this type of paragraph? Could I grab all the paragraphs with this class and then check each one with an if statement to see whether it contains an <a>? Any ideas?

Answer 1

You can apply multiple conditions to multiple elements in a single CSS selector:

soup.select("p.news-text a")

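This will locate all a elements that are descendants of a p element with the news-text class. The space in the selector is the descendant combinator, so links nested at any depth inside the paragraph match; use p.news-text > a to match direct children only.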
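
To make the descendant/child distinction concrete, here is a minimal sketch. The markup, class name "other", and hrefs are made up for illustration and are not from the question; it assumes a reasonably recent bs4 whose select() handles the > combinator.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one link directly inside the paragraph,
# one nested inside a <span>, and one in an unrelated paragraph.
html = """
<p class="news-text">
    <a href="/direct/">direct child</a>
    <span><a href="/nested/">nested deeper</a></span>
</p>
<p class="other"><a href="/elsewhere/">not in news-text</a></p>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendant combinator (space): any <a> at any depth inside p.news-text.
descendants = [a["href"] for a in soup.select("p.news-text a")]

# Child combinator (>): only <a> tags that are direct children of p.news-text.
children = [a["href"] for a in soup.select("p.news-text > a")]

print(descendants)  # ['/direct/', '/nested/']
print(children)     # ['/direct/']
```

In both cases the link inside the unrelated paragraph is ignored, which is the behaviour the question asks for.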

Demo:

In [11]: from bs4 import BeautifulSoup

In [12]: data = """<div class="news-overflow-hidden">
    ...:     <h3>
    ...:         <i class="pholder"></i>
    ...:         <a href="/news/view/141524/" style="">ЕСПЧ присудил €15 000 экс-главе службы безопасности ЮКОСа</a> </h3>
    ...:     <p class="news-text">
    ...:         <a href="/news/view/141524/">В такую сумму Европейский суд по правам человека оценил несоблюдение в отношении мужчины презумпции невиновности и нарушение при исследовании свидетельских показаний в судах.</a> </p>
    ...:     <i class="news-type-icon"></i>
    ...: </div>"""

In [13]: soup = BeautifulSoup(data, "html.parser")

In [14]: for a in soup.select("p.news-text a"):
    ...:     print(a.get_text(strip=True))
    ...:     
В такую сумму Европейский суд по правам человека оценил несоблюдение в отношении мужчины презумпции невиновности и нарушение при исследовании свидетельских показаний в судах.
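
For comparison, the loop-and-check approach floated in the question also works, just with more code. The sketch below uses a trimmed-down, hypothetical version of the question's markup (placeholder link text, plus an extra link-less paragraph added for illustration) and collects the href along with the text.

```python
from bs4 import BeautifulSoup

# Trimmed-down, hypothetical version of the markup from the question,
# with an extra news-text paragraph that contains no link.
html = """
<div class="news-overflow-hidden">
    <h3><a href="/news/view/141524/">headline</a></h3>
    <p class="news-text"><a href="/news/view/141524/">summary</a></p>
</div>
<p class="news-text">paragraph without a link</p>
"""

soup = BeautifulSoup(html, "html.parser")

links = []
for p in soup.find_all("p", class_="news-text"):
    a = p.find("a")  # first <a> inside this paragraph, or None if there is none
    if a is not None:
        links.append((a.get("href"), a.get_text(strip=True)))

print(links)  # [('/news/view/141524/', 'summary')]

# The CSS selector finds the same link in one call.
selected = [a.get("href") for a in soup.select("p.news-text a")]
```

Note that p.find("a") returns only the first link in each paragraph, whereas the selector returns every matching link; use p.find_all("a") in the loop if a paragraph may hold several.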

1 answers

solution1
4 2017-06-06 14:31:50