Exporting Python Scopus API results to CSV

Question

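I'm very new to Python, so I'm not sure whether this can be done, but I hope it can!

I have accessed the Scopus API and managed to run a search query, which gives me the following results in a Pandas DataFrame: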

                                                            search-results
entry                    [{'@_fa': 'true', 'affiliation': [{'@_fa': 'tr...
link                     [{'@_fa': 'true', '@ref': 'self', '@type': 'ap...
opensearch:Query         {'@role': 'request', '@searchTerms': 'AFFIL(un...
opensearch:itemsPerPage                                                200
opensearch:startIndex                                                    0
opensearch:totalResults                                             106652

If possible, I would like to export the 106652 results to a CSV file so that I can analyze them. Is this possible?

Answer 1

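First, you need to get all the results (see the comments under the question). The data you need (the search results) is in the "entry" list. You can extract that list and append it to a supporting list, iterating until you have all the results. Here I loop and, on each round, subtract the number of downloaded items (count) from the total number of results.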

        # Fragment from inside a class method: self._url holds the Scopus Search
        # API URL; query, view and MY_API_KEY are defined elsewhere.
        # Requires: import json, requests
        found_items_num = None  # unknown until the first response arrives
        start_item = 0
        items_per_query = 25
        max_items = 2000
        JSON = []

        print('GET data from Search API...')

        while found_items_num is None or found_items_num > 0:

            resp = requests.get(self._url,
                                headers={'Accept': 'application/json', 'X-ELS-APIKey': MY_API_KEY},
                                params={'query': query, 'view': view, 'count': items_per_query,
                                        'start': start_item})

            print('Current query url:\n\t{}\n'.format(resp.url))

            if resp.status_code != 200:
                # error
                raise Exception('ScopusSearchApi status {0}, response body:\n{1}\n'.format(resp.status_code, resp.text))

            # parse the response once and reuse it below
            data = resp.json()

            # on the first call, read the actual number of results
            if found_items_num is None:
                found_items_num = int(data.get('search-results').get('opensearch:totalResults'))
                print('GET returned {} articles.'.format(found_items_num))

                # check if the number of results exceeds the given limit
                if found_items_num > max_items:
                    print('WARNING: too many results, truncating to {}'.format(max_items))
                    found_items_num = max_items

            if found_items_num > 0:
                # write the fetched JSON page to its own file, named after the start index
                out_file = str(start_item) + '.json'

                with open(out_file, 'w') as f:
                    json.dump(data, f, indent=4)

                # check if the page returned some results
                if 'entry' in data.get('search-results', {}):
                    # combine entries to make a single JSON list
                    JSON += data['search-results']['entry']

            # set counters for the next cycle
            found_items_num -= items_per_query
            start_item += items_per_query
            print('Still {} results to be downloaded'.format(max(found_items_num, 0)))

        # end while - finished downloading JSON data
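
As a side check of the counter arithmetic in the loop above, here is a minimal sketch that computes the `start` offsets the loop would request, without touching the network. The helper name `page_starts` is hypothetical and not part of the original answer; the defaults mirror `items_per_query = 25` and `max_items = 2000` from the code.

```python
def page_starts(total_results, items_per_query=25, max_items=2000):
    """Return the 'start' offsets needed to fetch up to max_items results.

    Hypothetical helper for illustration only.
    """
    # Never ask for more than the cap, and never for more than exists.
    wanted = min(total_results, max_items)
    return list(range(0, wanted, items_per_query))


# With the 106652 results from the question and the cap of 2000:
starts = page_starts(106652)
print(len(starts), starts[:3], starts[-1])  # 80 [0, 25, 50] 1975
```

With these settings only the first 2000 of the 106652 results are downloaded, so `max_items` would have to be raised to collect everything.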

Then, outside the while loop, you can save the complete file like this:

        # after the while loop: save all collected entries in a single file
        with open('articles.json', 'w') as f:
            json.dump(JSON, f, indent=4)

Alternatively, you can convert the JSON data to CSV by following this guide I found online (untested; searching for "json to csv python" turns up plenty of guides).
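
One possible way to do that conversion with only the standard library is sketched below. It is not the guide mentioned above. The function name `entries_to_csv` is hypothetical, the sample entries are made up to resemble the API output, and storing nested fields (such as `affiliation`) as JSON text in a single cell is just one design choice among several.

```python
import csv
import json


def entries_to_csv(entries, csv_path):
    """Write a list of dicts (Scopus 'entry' items) to a CSV file.

    Hypothetical helper for illustration only.
    """
    # Collect every key that appears in any entry, keeping first-seen order,
    # because not every entry carries the same fields.
    fieldnames = []
    for entry in entries:
        for key in entry:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for entry in entries:
            # Nested lists/dicts are serialized to JSON text so they fit in one cell.
            row = {
                key: json.dumps(value, ensure_ascii=False)
                if isinstance(value, (list, dict)) else value
                for key, value in entry.items()
            }
            # Keys missing from this entry are left as empty cells.
            writer.writerow(row)


# Made-up sample data shaped like the 'entry' list
sample = [
    {'dc:title': 'Paper A', 'affiliation': [{'affilname': 'Univ X'}]},
    {'dc:title': 'Paper B', 'prism:coverDate': '2016-01-01'},
]
entries_to_csv(sample, 'articles.csv')
```

To convert the file saved earlier, load `articles.json` with `json.load` and pass the resulting list to the function.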

Answer 2

Since this question was created, Elsevier has released a Python library that makes it easier to search and to retrieve/store results:

https://github.com/ElsevierDev/elsapy/tree/master/elsapy


Answer 1
0  accepted  2016-10-04 16:25:08

Answer 2
0  2019-11-14 18:11:51
