Exporting links in Google Scholar results

I want to export data from Google Scholar. In particular, I want to export a list of articles that cite a particular paper. If I click the "Cited by" link, I can get this page. One way to export these data is to add all of them to my library, which can then be exported in four formats (BibTeX, RefMan, EndNote, CSV). However, none of these export formats includes the HTML link (URL) to each paper.

The other strategy would be to scrape the data, but I don't want to do that because I know it can be very tricky with Google Scholar's CAPTCHAs.

Is there a way to export the results of a Google Scholar search that includes the URL of each paper?

Do you mean for the page you're on? From the browser console (F12), run:

copy($$('li > a').map(a => a.href))

Now they're in your clipboard.
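
If the page has been saved to disk, roughly the same selection can be reproduced offline. The following is only a sketch using Python's standard library: it mirrors the `li > a` selector from the console snippet (anchors whose direct parent is a list item), and the function name `links_in_list_items` is made up for illustration. Whether that selector matches the links you want depends on the page's actual markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ListItemLinkCollector(HTMLParser):
    """Collects href values of <a> elements whose direct parent is <li>."""

    def __init__(self):
        super().__init__()
        self._stack = []   # currently open elements, innermost last
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self._stack and self._stack[-1] == "li":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def links_in_list_items(html):
    """Hypothetical helper: return hrefs matching the `li > a` pattern."""
    collector = ListItemLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs
```

Calling `links_in_list_items(open("saved_page.html", encoding="utf-8").read())` would return a list of URLs that can then be written to a file; `saved_page.html` is a placeholder file name.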

To extract the Cited by data, you need the ID of the Google Scholar organic search result that the Cited by link belongs to. You can find the ID in the data-cid HTML attribute of that result.
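
As a rough illustration of reading that attribute, the sketch below collects every `data-cid` value from a piece of HTML using only the standard library. The helper name `extract_cids` and the sample markup in the usage note are assumptions made for the example; only the `data-cid` attribute itself comes from the answer.

```python
from html.parser import HTMLParser


class DataCidCollector(HTMLParser):
    """Collects the value of every data-cid attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.cids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Skip empty or valueless attributes.
            if name == "data-cid" and value:
                self.cids.append(value)


def extract_cids(html):
    """Hypothetical helper: return all data-cid values found in `html`."""
    collector = DataCidCollector()
    collector.feed(html)
    collector.close()
    return collector.cids
```

For instance, `extract_cids('<div data-cid="FDc6HiktlqEJ"></div>')` yields a list containing that single ID.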

You can then query the following URL to retrieve the data: https://scholar.google.com/scholar?q=info:this_is_where_you_put_the_cite_id:scholar.google.com/&output=cite
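
A small sketch of filling in that URL template from a `data-cid` value follows. The function name `build_cite_url` is hypothetical, and the code only builds the string; it does not send a request, which Google Scholar may answer with a CAPTCHA.

```python
from urllib.parse import quote

# Template taken from the URL pattern described above.
CITE_URL_TEMPLATE = (
    "https://scholar.google.com/scholar"
    "?q=info:{cid}:scholar.google.com/&output=cite"
)


def build_cite_url(cid):
    """Hypothetical helper: return the cite URL for one data-cid value."""
    if not cid:
        raise ValueError("cid must be a non-empty string")
    # Percent-encode anything outside the URL-safe unreserved characters.
    return CITE_URL_TEMPLATE.format(cid=quote(cid, safe=""))
```

With the ID used elsewhere in this answer, `build_cite_url("FDc6HiktlqEJ")` produces the URL with `FDc6HiktlqEJ` in place of the placeholder.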

There are also third-party solutions, such as SerpApi, that do this for you. It's a paid API with a free trial.

Example Python code (client libraries for other languages are also available):

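# Requires SerpApi's Python client (pip install google-search-results).
from serpapi import GoogleSearch

params = {
  "engine": "google_scholar_cite",  # engine that returns the citation data
  "q": "FDc6HiktlqEJ",              # the data-cid value of the search result
  "api_key": "secret_api_key",      # replace with your own API key
}

search = GoogleSearch(params)
results = search.get_dict()         # response parsed into a Python dict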

Example JSON output:

"citations": [
  {
    "title": "MLA",
    "snippet": "Schwertmann, U. T. R. M., and Reginald M. Taylor. \"Iron oxides.\" Minerals in soil environments 1 (1989): 379-438."
  },
  {
    "title": "APA",
    "snippet": "Schwertmann, U. T. R. M., & Taylor, R. M. (1989). Iron oxides. Minerals in soil environments, 1, 379-438."
  },
  {
    "title": "Chicago",
    "snippet": "Schwertmann, U. T. R. M., and Reginald M. Taylor. \"Iron oxides.\" Minerals in soil environments 1 (1989): 379-438."
  },
  {
    "title": "Harvard",
    "snippet": "Schwertmann, U.T.R.M. and Taylor, R.M., 1989. Iron oxides. Minerals in soil environments, 1, pp.379-438."
  },
  {
    "title": "Vancouver",
    "snippet": "Schwertmann UT, Taylor RM. Iron oxides. Minerals in soil environments. 1989 Jan 1;1:379-438."
  }
],
"links": [
  {
    "name": "BibTeX",
    "link": "https://scholar.googleusercontent.com/scholar.bib?q=info:FDc6HiktlqEJ:scholar.google.com/&output=citation&scisdr=CgXpniNQGAA:AAGBfm0AAAAAYMu3WkYJI4po_pgcUVKgwwFp1dl5uNYk&scisig=AAGBfm0AAAAAYMu3WlZR_joxo-i8FTZ1CphjzmW_d447&scisf=4&ct=citation&cd=-1&hl=en"
  },
  {
    "name": "EndNote",
    "link": "https://scholar.googleusercontent.com/scholar.enw?q=info:FDc6HiktlqEJ:scholar.google.com/&output=citation&scisdr=CgXpniNQGAA:AAGBfm0AAAAAYMu3WkYJI4po_pgcUVKgwwFp1dl5uNYk&scisig=AAGBfm0AAAAAYMu3WlZR_joxo-i8FTZ1CphjzmW_d447&scisf=3&ct=citation&cd=-1&hl=en"
  },
  {
    "name": "RefMan",
    "link": "https://scholar.googleusercontent.com/scholar.ris?q=info:FDc6HiktlqEJ:scholar.google.com/&output=citation&scisdr=CgXpniNQGAA:AAGBfm0AAAAAYMu3WkYJI4po_pgcUVKgwwFp1dl5uNYk&scisig=AAGBfm0AAAAAYMu3WlZR_joxo-i8FTZ1CphjzmW_d447&scisf=2&ct=citation&cd=-1&hl=en"
  },
  {
    "name": "RefWorks",
    "link": "https://scholar.googleusercontent.com/scholar.rfw?q=info:FDc6HiktlqEJ:scholar.google.com/&output=citation&scisdr=CgXpniNQGAA:AAGBfm0AAAAAYMu3WkYJI4po_pgcUVKgwwFp1dl5uNYk&scisig=AAGBfm0AAAAAYMu3WlZR_joxo-i8FTZ1CphjzmW_d447&scisf=1&ct=citation&cd=-1&hl=en"
  }
]
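
Given a response shaped like the example output above, the export links can be pulled out by format name. This is a sketch that assumes only the `links`, `name`, and `link` keys visible in that output; the function name `export_links` is made up for illustration.

```python
def export_links(results):
    """Hypothetical helper: map each export format name to its link.

    `results` is expected to be a dict shaped like the example output,
    i.e. with a "links" list of {"name": ..., "link": ...} entries.
    Entries missing either key are skipped.
    """
    return {
        item["name"]: item["link"]
        for item in results.get("links", [])
        if "name" in item and "link" in item
    }
```

For example, `export_links(results).get("BibTeX")` returns the BibTeX link if one is present, or `None` otherwise.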

Check out the documentation for more details.

Disclaimer: I work at SerpApi.
