Untagged text extraction with Python is not working

Question

I want to extract 1626 from the markup below using Python and Beautiful Soup. I have tried the answer to "Accessing untagged text using beautifulsoup", but all I get back is an empty list: []

<div class="columns">
<h1 style="line-height: .85em; margin-top: 0" class="panel-border text-primary strong">
            Laundry Dry Cleaning Equipment
            <br>

            <br>
</h1>

        1626 Total Items
<!-- br-->
<div>...</div>
</div>

How can I extract the number?

Answer 1

You can loop through the lines of the HTML and pick out what you need with a regex:

import bs4, re

page = """
<div class="columns">
<h1 style="line-height: .85em; margin-top: 0" class="panel-border text-primary strong">
            Laundry Dry Cleaning Equipment
            <br>

            <br>
</h1>

        1626 Total Items
    5526 Total Items
                    4426 Total Items
<!-- br-->
<div>...</div>
</div>"""

soup = bs4.BeautifulSoup(page, 'lxml')

divs = soup.find_all('div', {'class': 'columns'})
div = divs[0]    # only one div has the class "columns"

divtext = str(div).split('\n')   # get the div's HTML and split it into lines
for line in divtext:
    line = line.strip()

    # match the wanted pattern
    match = re.match(r'^(\d+)\s*Total Items$', line)

    if match is not None:     # a match was found
        print(match.group(1)) # extract the number
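
A variation on the loop above, offered as a sketch rather than a replacement: instead of splitting the div's HTML source into lines, the regex can be run over the text that Beautiful Soup extracts with get_text(). The built-in html.parser is assumed here so that lxml does not need to be installed, and the names text and counts are only illustrative.

```python
import re
from bs4 import BeautifulSoup

page = """
<div class="columns">
<h1 style="line-height: .85em; margin-top: 0" class="panel-border text-primary strong">
            Laundry Dry Cleaning Equipment
            <br>

            <br>
</h1>

        1626 Total Items
<!-- br-->
<div>...</div>
</div>"""

soup = BeautifulSoup(page, "html.parser")
div = soup.find("div", class_="columns")

# get_text() strips the tags and returns the remaining text as one string
text = div.get_text()

# capture every run of digits that is followed by "Total Items"
counts = [int(n) for n in re.findall(r"(\d+)\s*Total Items", text)]
print(counts)
```

Because re.findall returns every match, this also handles the case where several "Total Items" lines appear inside the same div.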

Answer 2

I followed the same approach as the answer you linked in your question.

Hopefully this is what you are looking for.

Code:

from bs4 import BeautifulSoup

data = """
<div class="columns">
<h1 style="line-height: .85em; margin-top: 0" class="panel-border text-primary strong">
            Laundry Dry Cleaning Equipment
            <br>

            <br>
</h1>

        1626 Total Items
<!-- br-->
<div>...</div>
</div>
"""
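soup = BeautifulSoup(data, 'html.parser')
# string=True matches every text node, tagged or not
for i in soup.find_all(string=True, recursive=True):
    if "Total Items" in i:
        # drop the label, then trim the surrounding spaces and newlines
        print(i.replace('Total Items', '').strip())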

Output:

1626
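
If the number always sits directly after the closing h1 tag, as it does in the question's markup, another option is to step to that text node with next_sibling instead of scanning every string. This is a minimal sketch under that assumption; the names h1, raw and number are illustrative.

```python
from bs4 import BeautifulSoup

data = """
<div class="columns">
<h1 style="line-height: .85em; margin-top: 0" class="panel-border text-primary strong">
            Laundry Dry Cleaning Equipment
            <br>

            <br>
</h1>

        1626 Total Items
<!-- br-->
<div>...</div>
</div>
"""

soup = BeautifulSoup(data, "html.parser")
h1 = soup.find("div", class_="columns").find("h1")

# the untagged text is the node that immediately follows </h1>
raw = h1.next_sibling

# split() discards the surrounding whitespace; the first token is the number
number = int(raw.split()[0])
print(number)
```

This depends on the page layout staying the same: if another element is placed between the h1 and the text, next_sibling returns that element instead.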
