Python, Beautiful Soup: how to get the desired element

I am trying to reach a specific element while parsing the source code of a site. This is a snippet of the part I'm trying to parse (shown here through Friday); the structure is the same for every day of the week:

<div id="intForecast">
    <h2>Forecast for Rome</h2>
    <table cellspacing="0" cellpadding="0" id="nonCA">
        <tr>
            <td onclick="showDetails('1');return false" id="day1" class="on">
                <span>Thursday</span>
                <div class="intIcon"><img src="http://icons.wunderground.com/graphics/conds/2005/sunny.gif" alt="sunny" /></div>
                <div>Clear</div>
                <div><span class="hi">H <span>22</span>&deg;</span> / <span class="lo">L <span>11</span>&deg;</span></div>
            </td>
            <td onclick="showDetails('2');return false" id="day2" class="off">
                <span>Friday</span>
                <div class="intIcon"><img src="http://icons.wunderground.com/graphics/conds/2005/partlycloudy.gif" alt="partlycloudy" /></div>
                <div>Partly Cloudy</div>
                <div><span class="hi">H <span>21</span>&deg;</span> / <span class="lo">L <span>15</span>&deg;</span></div>
            </td>
        </tr>
    </table>
</div>

...and so on for the remaining days.

I actually got the result I wanted, but in what I think is an ugly way:

forecastFriday = soup.find('div', text='Friday').findNext('div').findNext('div').string

As you can see, I walk down through the elements by repeating .findNext('div') and finally reach .string.

I want to get Friday's forecast text, "Partly Cloudy".

Is there a more Pythonic way to do this? Thanks!

Simply find all of the <td> elements and iterate over them:

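from bs4 import BeautifulSoup

soup = BeautifulSoup(your_html, 'html.parser')
div = soup.find('div', id='intForecast')
tds = div.find('table').find_all('td')

for td in tds:
    day = td.find('span').text             # first <span> holds the day name
    forecast = td.find_all('div')[1].text  # second <div> holds the description
    print(day, forecast)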
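If only one day is needed, as in the question, the same loop can fill a dictionary keyed by day name. What follows is a minimal sketch, assuming Beautiful Soup 4 (the bs4 package) and Python's built-in html.parser. The helper name forecasts_by_day is hypothetical, and the markup is a trimmed copy of the snippet from the question.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (temperatures omitted).
html = """
<div id="intForecast">
  <table id="nonCA">
    <tr>
      <td id="day1" class="on">
        <span>Thursday</span>
        <div class="intIcon"><img src="sunny.gif" alt="sunny" /></div>
        <div>Clear</div>
      </td>
      <td id="day2" class="off">
        <span>Friday</span>
        <div class="intIcon"><img src="partlycloudy.gif" alt="partlycloudy" /></div>
        <div>Partly Cloudy</div>
      </td>
    </tr>
  </table>
</div>
"""


def forecasts_by_day(markup):
    """Return a dict mapping each day name to its forecast text.

    Hypothetical helper; it assumes every <td> holds the day name in its
    first <span> and the description in its second <div>.
    """
    soup = BeautifulSoup(markup, "html.parser")
    container = soup.find("div", id="intForecast")
    result = {}
    for td in container.find_all("td"):
        day = td.find("span").get_text(strip=True)
        # The first <div> wraps the icon; the second holds the description.
        result[day] = td.find_all("div")[1].get_text(strip=True)
    return result


forecasts = forecasts_by_day(html)
print(forecasts["Friday"])  # prints "Partly Cloudy"
```

Looking a day up by key avoids chaining .findNext('div') calls, though it still relies on the order of the <div> elements inside each cell staying the same.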
