
Export Python-Scopus API results into CSV

I'm very new to Python, so I'm not sure if this can be done, but I hope it can!

I have accessed the Scopus API and managed to run a search query, which gives me the following results in a pandas dataframe:

                                                            search-results
entry                    [{'@_fa': 'true', 'affiliation': [{'@_fa': 'tr...
link                     [{'@_fa': 'true', '@ref': 'self', '@type': 'ap...
opensearch:Query         {'@role': 'request', '@searchTerms': 'AFFIL(un...
opensearch:itemsPerPage                                                200
opensearch:startIndex                                                    0
opensearch:totalResults                                             106652

If possible, I'd like to export all 106,652 results into a CSV file so that they can be analysed. Is this possible at all?

First, you need to get all the results (see the comments under the question). The data you need (the search results) is inside the "entry" list. You can extract that list and append it to a collector list, iterating until you have all the results. Here I loop, and at every round I subtract the number of downloaded items (count) from the total number of results.

    import json
    import requests

    # MY_API_KEY, query and view are assumed to be defined elsewhere
    SEARCH_URL = 'https://api.elsevier.com/content/search/scopus'

    found_items_num = 1
    start_item = 0
    items_per_query = 25
    max_items = 2000
    JSON = []

    print('GET data from Search API...')

    while found_items_num > 0:

        resp = requests.get(SEARCH_URL,
                            headers={'Accept': 'application/json', 'X-ELS-APIKey': MY_API_KEY},
                            params={'query': query, 'view': view, 'count': items_per_query,
                                    'start': start_item})

        print('Current query url:\n\t{}\n'.format(resp.url))

        if resp.status_code != 200:
            # error
            raise Exception('ScopusSearchApi status {0}, JSON dump:\n{1}\n'.format(resp.status_code, resp.json()))

        # found_items_num is initialized to 1; on the first call, set it to the actual total
        if found_items_num == 1:
            found_items_num = int(resp.json().get('search-results').get('opensearch:totalResults'))
            print('GET returned {} articles.'.format(found_items_num))

        if found_items_num > 0:
            # write the fetched JSON data to a file
            out_file = '{}.json'.format(start_item)

            with open(out_file, 'w') as f:
                json.dump(resp.json(), f, indent=4)

            # check whether the number of results exceeds the given limit
            if found_items_num > max_items:
                print('WARNING: too many results, truncating to {}'.format(max_items))
                found_items_num = max_items

            # check whether the query returned any entries
            if 'entry' in resp.json().get('search-results', {}):
                # combine entries to build a single JSON list
                JSON += resp.json()['search-results']['entry']

        # set counters for the next cycle
        found_items_num -= items_per_query
        start_item += items_per_query
        print('Still {} results to be downloaded'.format(found_items_num if found_items_num > 0 else 0))

    # end while - finished downloading JSON data

Then, outside the while loop, you can save the complete file like this:

    out_file = 'articles.json'
    with open(out_file, 'w') as f:
        json.dump(JSON, f, indent=4)

Or you can follow a guide I found online (not tested; you can search for "json to csv python" and you'll find many guides) to convert the JSON data to a CSV.
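Since the question already uses pandas, one simple route is to flatten the combined "entry" list with `pd.json_normalize` and write it out with `to_csv`. A minimal sketch; the `entries_to_csv` helper and the field names below are illustrative stand-ins, not the exact keys of a real Scopus entry:

```python
import pandas as pd

def entries_to_csv(entries, out_path):
    # json_normalize flattens nested keys into flat columns,
    # so each entry dict becomes one CSV row
    df = pd.json_normalize(entries)
    df.to_csv(out_path, index=False)
    return df

# hypothetical minimal entries mimicking two Scopus search results
sample = [
    {'dc:title': 'Paper A', 'prism:doi': '10.1000/a', 'prism:coverDate': '2019-01-01'},
    {'dc:title': 'Paper B', 'prism:doi': '10.1000/b', 'prism:coverDate': '2020-06-15'},
]
df = entries_to_csv(sample, 'articles.csv')
```

In practice you would pass the `JSON` list built in the loop above instead of `sample`; deeply nested fields such as the affiliation list come out as dotted column names.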

Since this question was created, Elsevier has published a Python library that makes searching and retrieval/storage of results quite a bit easier:

https://github.com/ElsevierDev/elsapy/tree/master/elsapy
