
How to scrape specific text from a webpage in Python using BeautifulSoup?

I want to quickly obtain the day's exchange rate from www.xoom.com.

This is what I have so far:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.xoom.com')
data = r.text
soup = BeautifulSoup(data, 'html.parser')

Next, after viewing the page source, I found that the exchange rate appears here:

<p class="xcma-fx-rate">Current locked-in exchange rate* <em class="fx-rate">1 USD = 60.1500 INR</em></p>

I tried several things like:

soup.find_all('div class')

But that gives me an empty list: []

How do I scrape the exchange rate?

Try this:

text_rate = soup.find('em', attrs={'class': 'fx-rate'}).get_text()
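
For reference, here is a small offline sketch of why soup.find_all('div class') comes back empty and what works instead. It parses the snippet quoted in the question rather than the live page, so the html string below is only that snippet and may not match the site's current markup:

```python
from bs4 import BeautifulSoup

# The snippet quoted in the question, used as stand-in input (not the live page).
html = (
    '<p class="xcma-fx-rate">Current locked-in exchange rate* '
    '<em class="fx-rate">1 USD = 60.1500 INR</em></p>'
)
soup = BeautifulSoup(html, 'html.parser')

# find_all() treats its first argument as a tag name, and no tag is
# literally named "div class", so this matches nothing.
no_match = soup.find_all('div class')

# Pass the tag name and the class separately instead.
by_attrs = soup.find('em', attrs={'class': 'fx-rate'})
by_keyword = soup.find('em', class_='fx-rate')
by_selector = soup.select_one('em.fx-rate')

text_rate = by_attrs.get_text()
print(no_match)
print(text_rate)
```

All three lookups find the same element here; which one to use is mostly a matter of taste.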

Alternatively, using lxml, and assuming that the element really is on the page, you can get the rate with this code:

import requests
import lxml.html

r = requests.get('https://www.xoom.com/india/send-money')
data = r.text

tree = lxml.html.fromstring(data)

rate = tree.xpath("//em[@class='fx-rate']")

print(rate[0].text_content())

prints 1 USD = 60.1500 INR

First of all, I was fetching the wrong page.

I have opened that website many times, so when I visit the homepage address in my browser it automatically shows me the 'india' exchange rate page (I guess it checks a cookie stored on my machine). That doesn't happen when I fetch the page with Python, so I need to request the right page explicitly. The rate is shown at: xoom.com/india/send-money
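
One way to catch this kind of mistake early is to check whether the element is present in the HTML that was actually returned before reading its text. A minimal sketch, where extract_rate and the two sample strings are made up for illustration (the real pages will look different):

```python
from bs4 import BeautifulSoup


def extract_rate(html):
    """Return the text of <em class="fx-rate">, or None if it is absent."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('em', class_='fx-rate')
    return tag.get_text(strip=True) if tag is not None else None


# Hypothetical stand-ins for the two responses described above.
generic_home_page = '<html><body><h1>Send money</h1></body></html>'
country_page = (
    '<html><body><em class="fx-rate">1 USD = 60.1500 INR</em></body></html>'
)

print(extract_rate(generic_home_page))
print(extract_rate(country_page))
```

With requests you would pass r.text to the function, and r.url shows the address that was finally served after any redirects.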

Now the correct code is:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.xoom.com/india/send-money')
data = r.text
soup = BeautifulSoup(data, 'html.parser')

# Print the text of every <em> element; the exchange rate is one of them.
for rate in soup.find_all('em'):
    print(rate.text)

I tried the code provided, since I am scraping this site as well, and I think I found a solution.

You can replace the link if you want:

import requests
import lxml.html

r = requests.get('https://www.xoom.com/philippines/send-money')
data = r.text
tree = lxml.html.fromstring(data)
rate = tree.xpath("//div[@class='js-exchange-rate']")

rate[0].text_content()

I am using Python 3.8 and Anaconda.

The result was:

' 1 USD = 49.1238 PHP* '
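
If you need the number rather than the raw string, the surrounding spaces and the trailing asterisk can be dropped with a regular expression. This is only a sketch: parse_rate is a name I made up, and it assumes the text keeps the "1 XXX = amount YYY" shape shown above.

```python
import re
from decimal import Decimal


def parse_rate(text):
    """Split text like ' 1 USD = 49.1238 PHP* ' into (base, rate, quote)."""
    match = re.search(r'1\s+([A-Z]{3})\s*=\s*(\d+(?:\.\d+)?)\s+([A-Z]{3})', text)
    if match is None:
        raise ValueError('unrecognised rate text: %r' % text)
    base, amount, quote = match.groups()
    # Decimal keeps the digits exactly as they appeared on the page.
    return base, Decimal(amount), quote


print(parse_rate(' 1 USD = 49.1238 PHP* '))
```

Decimal is used instead of float so that no rounding is introduced when the rate is later multiplied by an amount of money.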
