[英]How to query with time filters in GoogleScraper?
Even if Google's official API does not offer time information in the query results - even no time filtering for keywords, there is time filtering option in the advanced search: 即使Google的官方API不在查询结果中提供时间信息-即使没有针对关键字的时间过滤,高级搜索中也有时间过滤选项:
Google results for stackoverflow in the last one hour Google在过去一小时内发现了stackoverflow
GoogleScraper library offers many flexible options BUT time related ones. GoogleScraper库提供了许多与时间相关的灵活选项。 How to add time features using the library?
如何使用库添加时间功能?
After a bit of inspection, I've found that time Google sends the filtering information by qdr
value to the tbs
key (possibly means time based search
although not officially stated): 经过一番检查后,我发现Google将
qdr
值的过滤信息发送到tbs
键的time based search
(可能是time based search
尽管未正式说明):
https://www.google.com/search?tbs=qdr:h1&q=stackoverflow https://www.google.com/search?tbs=qdr:h1&q=stackoverflow
This gets the results for the past hour. 这将获取过去一个小时的结果。
m
and y
letters can be used for months and years respectively. m
和y
字母分别可以使用几个月和几年。
Also, to add sorting by date feature, add the sbd
(should mean sort by date
) value as well: https://www.google.com/search?tbs=qdr:h1,sbd:1&q=stackoverflow 另外,要添加按日期排序功能,请同时添加
sbd
(应表示sort by date
)值: https : //www.google.com/search? sbd
, sbd
:1 & sbd
I was able to insert these keywords to the BASE Google URL of GoogleScraper. 我能够将这些关键字插入GoogleScraper的BASE Google URL。 Insert below lines to the end of
get_base_search_url_by_search_engine()
method (just before return
) in scraping.py
: 在下面插入线的端部
get_base_search_url_by_search_engine()
方法(只是之前return
在) scraping.py
:
if("google" in str(specific_base_url)):
specific_base_url = "https://www.google.com/search?tbs=qdr:{},sbd:1".format(config.get("time_filter", ""))
Now use the time_filter
option in your config: 现在,在您的配置中使用
time_filter
选项:
from GoogleScraper import scrape_with_config
config = {
'use_own_ip': True,
'keyword_file': "keywords.txt",
'search_engines': ['google'],
'num_pages_for_keyword': 2,
'scrape_method': 'http',
"time_filter": "d15" #up to 15 days ago
}
search = scrape_with_config(config)
Results will only include the time range. 结果将仅包括时间范围。 Additionally, text snippets in the results will have raw date information:
此外,结果中的文本片段将具有原始日期信息:
one_sample_result = search.serps[0].links[0]
print(one_sample_result.snippet)
4 mins ago It must be pretty easy - let propertytotalPriceOfOrder = order.items.map(item => +item.unit * +item.quantity * +item.price);.
4分钟前这一定很容易-让propertytotalPriceOfOrder = order.items.map(item => + item.unit * + item.quantity * + item.price);。 where order is your entire json object.
其中order是您的整个json对象。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.