Removing extra characters from a Python string

Question

Information I want to extract: the locations Al Bayan and Nepal, as a list: ['Al Bayan', 'Nepal']

<div class="location">
<div class="listing-location">Location</div>
<div class="location-areas">
<span class="location">Al Bayan</span>
‪,‪
<span class="location">Nepal</span>
</div>
<div class="area-description"> 3.3 km from Mall of the Emirates </div>
</div>

Code to extract the area:

# Area
try:
    area = soup.find('div', 'location-areas')
    area_result = str(area.get_text().strip().encode("utf-8"))
    print([area_result])
except StandardError as e:
    area_result = "Error was {0}".format(e)
    print area_result

Output:

"Al Bayanأ¢â‚¬آھ,أ¢â‚¬آھ

                            Nepal"

Desired Output:

['Al Bayan', 'Nepal']

Answer 1

I assume soup is a BeautifulSoup instance, created with something like soup = BeautifulSoup(html_string, "html.parser"), where html_string is your HTML markup.

Try this out:

area_list = [area.get_text().strip().encode('utf-8') for area in soup.find_all('span', {'class': 'location'})]
print area_list
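
If you would rather keep working from the text of the whole location-areas div, a rough alternative is to strip the invisible characters yourself and split on the comma. The sketch below is written for Python 3 and uses only the standard library. It assumes the stray characters around the comma are U+202A (LEFT-TO-RIGHT EMBEDDING) marks, whose UTF-8 bytes are E2 80 AA; that is a guess based on the garbled output in the question. raw_text is a hypothetical stand-in for what area.get_text() returns, and split_locations is a name made up for this example.

```python
import unicodedata

# Hypothetical stand-in for area.get_text() on the markup in the question.
# Assumption: the comma is wrapped in U+202A (LEFT-TO-RIGHT EMBEDDING) marks.
raw_text = "\nAl Bayan\n\u202a,\u202a\nNepal\n"


def split_locations(text):
    """Drop invisible format characters, then split the text on commas."""
    # Unicode category "Cf" covers format characters such as U+202A.
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Split on the comma, trim whitespace, and discard empty pieces.
    return [part.strip() for part in visible.split(",") if part.strip()]


locations = split_locations(raw_text)
print(locations)  # ['Al Bayan', 'Nepal']
```

Selecting the span elements directly, as in the one-liner above, avoids the problem altogether, because the invisible marks sit between the spans rather than inside them. The filter is mainly useful when the surrounding text is all you have.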

solution1
0 2016-05-30 18:07:53
