简体   繁体   中英

Scraping sector information from Yahoo Finance into Google Sheets using IMPORTXML

I am very new to web-scraping and was introduced to it just today after trying to figure out a formula on a spreadsheet.

I would like to retrieve the Sector information onto Yahoo Finance into Google Sheets. I would also like to the data to update when there is a change to cell B7. Link: https://finance.yahoo.com/quote/MIDD/profile?p=MIDD

I came up with the following, but get a #N/A error: =importxml("https://finance.yahoo.com/quote/",B7,"/profile?p=",B7, "//*[@class='Fw(600) [@data-reactid='21']")

Please let me know what I might be doing wrong. Thank you in advance.

Solution

This is the right syntax to use IMPORTXML formula:

=IMPORTXML("URL", "XPATH_QUERY")

In your case this will translate to:

=importxml("https://finance.yahoo.com/quote/"&B7&"/profile?p="&B7,"//*[@class='Fw(600)'] [@data-reactid='21']")

Which will return an empty result.

Considerations

Keep in mind that many sites go to great lengths to actively prevent scraping. Allowing you to scrape their data entirely, undermines their business model. Since they might make profit from adds for example.

Check in the page you want to scrape if the tags you are watching for correspond to the data you wanted to get in the first place. I believe in this case it's just a matter of changing the tags values to the proper ones.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM