How to use the GoogleScraper package in Python to scrape links from different search engines

Question

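I want to scrape links from different search engines for my search query in Python.

For example:

Query: "Who is Sachin Tendulkar"

Output: links from Google search and Bing search.

After digging through many links, I found the GoogleScraper package.

GoogleScraper link:

https://pypi.python.org/pypi/GoogleScraper/0.1.37

But I have had no luck with this package. Can anyone help me use GoogleScraper, or suggest an alternative way to scrape the links?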

Answer 1

Hey, you can do this with the same GoogleScraper package you mentioned. See https://github.com/NikolaiT/GoogleScraper

Here is the Python code as well:

import traceback

from GoogleScraper import scrape_with_config, GoogleSearchError


def save_link(query):
    """Scrape search results for `query` and write them to a JSON file."""
    if not query:
        return None

    # Build the output file name from the query (spaces become underscores).
    file_name = query.replace(" ", "_")

    # See the config.cfg file for possible values.
    config = {
        'SCRAPING': {
            'use_own_ip': 'True',
            'keyword': query,
            'search_engines': 'bing',
            'num_pages_for_keyword': 1,
            'scrape_method': 'http'
        },
        'SELENIUM': {
            'sel_browser': 'chrome',
        },
        'OUTPUT': {
            'output_filename': "path/" + file_name + ".json"
        },
        'GLOBAL': {
            'do_caching': 'False'
        }
    }

    try:
        # Run the scrape; results are written to output_filename.
        sqlalchemy_session = scrape_with_config(config)
        return sqlalchemy_session
    except Exception:
        # Print the full traceback instead of failing silently.
        print(traceback.format_exc())
        return None

If you want to use it for multiple search engines, you can set

'search_engines': 'bing, yahoo, google',

You will get the JSON in the file given by output_filename.
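To pull just the links back out of that JSON file, something like the following may help. It is only a sketch: it assumes each result entry stores its URL under a key named `link`, which is not confirmed in this thread, so check your own output file and change the `key` argument if needed. The helper names `collect_links` and `load_links`, and the sample data, are made up for illustration and are not part of GoogleScraper.

```python
import json


def collect_links(node, key="link"):
    """Recursively gather every string stored under `key` in parsed JSON."""
    links = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key and isinstance(value, str):
                links.append(value)
            else:
                links.extend(collect_links(value, key))
    elif isinstance(node, list):
        for item in node:
            links.extend(collect_links(item, key))
    return links


def load_links(path, key="link"):
    """Read a JSON file from `path` and return all values found under `key`."""
    with open(path, encoding="utf-8") as handle:
        return collect_links(json.load(handle), key)


# Hypothetical sample shaped like a list of result pages; the real layout may differ.
sample = [
    {
        "search_engine_name": "bing",
        "results": [
            {"link": "https://example.com/a", "title": "A"},
            {"link": "https://example.com/b", "title": "B"},
        ],
    },
]
print(collect_links(sample))
```

Because the helper walks the whole structure instead of relying on a fixed layout, it should keep working if the file nests results per search engine or per page, as long as the key name matches.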
