
List all public packages in the npm registry

For research purposes, I'd like to list all the packages that are available on npm. How can I do this?

Some old docs at https://github.com/npm/registry/blob/master/docs/REGISTRY-API.md#get-all mention an /-/all endpoint that presumably once worked, but http://registry.npmjs.org/-/all now just returns {"message":"deprecated"}.

http://blog.npmjs.org/post/157615772423/deprecating-the-all-registry-endpoint describes the deprecation of the http://registry.npmjs.org/-/all endpoint, and links to the tutorial at https://github.com/npm/registry/blob/master/docs/follower.md as an alternative approach. That tutorial describes how to set up a "follower" that receives all changes made to the NPM registry. That's... a bit odd, honestly. Clearly such a follower is not an adequate substitute for getting a list of all packages if you want to do data analysis on the entire NPM ecosystem.

However, within that codebase we learn that at the heart of the NPM registry is a CouchDB database located at https://replicate.npmjs.com. The _all_docs endpoint is not disabled, so we can hit it at https://replicate.npmjs.com/_all_docs to get back a JSON object whose rows property contains a list of all public packages on NPM. Each package looks like:

{"id":"lodash","key":"lodash","value":{"rev":"634-9273a19c245f088da22a9e4acbabc213"}},

At the time of writing, that response contains 618660 rows and comes to around 64MB.
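As a rough illustration of working with that response, here is a minimal Python sketch that pulls the package names out of the rows property. The function names are my own, and filtering out ids that start with _design/ is an assumption based on how CouchDB names design documents, in case any appear in the listing. It also assumes the endpoint still responds as described above.

```python
import json
import urllib.request

ALL_DOCS_URL = "https://replicate.npmjs.com/_all_docs"


def package_names(all_docs):
    """Return the package names from a parsed _all_docs response.

    Assumption: CouchDB design documents (ids starting with "_design/")
    are not packages, so they are skipped if any are present.
    """
    return [
        row["id"]
        for row in all_docs["rows"]
        if not row["id"].startswith("_design/")
    ]


def fetch_all_docs(url=ALL_DOCS_URL):
    """Download and parse the whole listing (tens of MB, held in memory)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Usage (performs a large download):
#   names = package_names(fetch_all_docs())
#   print(len(names))
```

Loading the whole response with json.load is fine at roughly 64MB; a streaming parser would only be worth it if memory is tight.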

If you want more data about a particular package, you can look it up using its key, e.g. hit https://replicate.npmjs.com/lodash to get a huge document containing stuff like Lodash's description and release history.
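A sketch of that lookup in Python follows. The helper names are hypothetical. Two things here are assumptions rather than anything stated above: that scoped names such as @babel/core need the slash percent-encoded in the URL, and that the document exposes description and time fields (with time mapping version numbers to publish timestamps, plus created and modified entries).

```python
import json
import urllib.parse
import urllib.request

REGISTRY_DB = "https://replicate.npmjs.com"


def package_url(name, base=REGISTRY_DB):
    """Build the document URL for a package key.

    Assumption: the "/" in a scoped name must be encoded as %2F,
    while the leading "@" can stay as-is.
    """
    return f"{base}/{urllib.parse.quote(name, safe='@')}"


def summarize(doc):
    """Pick a few fields out of a package document.

    Assumption: "time" maps versions to timestamps and also holds
    "created" and "modified" entries, which are dropped here.
    """
    times = doc.get("time", {})
    releases = {
        version: stamp
        for version, stamp in times.items()
        if version not in ("created", "modified")
    }
    return {
        "name": doc.get("name", doc.get("_id")),
        "description": doc.get("description"),
        "releases": releases,
    }


def fetch_package(name):
    """Download and parse one package document."""
    with urllib.request.urlopen(package_url(name)) as response:
        return json.load(response)


# Usage (performs a network request):
#   print(summarize(fetch_package("lodash"))["description"])
```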

If you want all the current data about all packages, you could use the include_docs parameter to _all_docs to include the actual document bodies in the response, i.e. hit https://replicate.npmjs.com/_all_docs?include_docs=true. Be ready for a lot of data.
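Since a single include_docs=true response is enormous, one option is to fetch it in pages. The sketch below uses the limit, startkey and skip query parameters, which are standard for CouchDB's _all_docs view; whether this particular host honours them is an assumption I have not verified. The function names and the injectable fetch argument are my own, the latter so that the paging logic can be exercised without touching the network.

```python
import json
import urllib.parse
import urllib.request

REGISTRY_DB = "https://replicate.npmjs.com"


def page_url(start_key=None, limit=1000, base=REGISTRY_DB):
    """Build an _all_docs URL for one page of full documents."""
    params = {"limit": limit, "include_docs": "true"}
    if start_key is not None:
        # CouchDB expects startkey as a JSON value, so a string needs quotes.
        params["startkey"] = json.dumps(start_key)
        # The start key was the last row of the previous page; skip it.
        params["skip"] = 1
    return f"{base}/_all_docs?{urllib.parse.urlencode(params)}"


def fetch_json(url):
    """Download and parse one page."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_docs(fetch=fetch_json, limit=1000):
    """Yield every document body, one page at a time."""
    start_key = None
    while True:
        rows = fetch(page_url(start_key, limit))["rows"]
        if not rows:
            return
        for row in rows:
            yield row["doc"]
        start_key = rows[-1]["key"]


# Usage (many network requests, a lot of data):
#   for doc in iter_docs(limit=500):
#       print(doc["_id"])
```

Paging by the last key seen, rather than by a growing skip offset, keeps each request cheap for the server regardless of how far into the listing you are.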

If you need yet more data that is not included in these CouchDB documents, like download counts, then it is worth perusing the docs at https://github.com/npm/registry/tree/master/docs, which detail some other available APIs, with the caveat, noted in the question, that not everything documented there actually works.
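For download counts specifically, a minimal sketch might look like the following. It assumes the download-counts API is served at https://api.npmjs.org/downloads/point/{period}/{package} and that the JSON response carries a downloads field; both should be checked against the docs linked above before relying on them. The function names are hypothetical, and scoped package names are not handled.

```python
import json
import urllib.request

# Assumption: base URL of the download-counts API.
DOWNLOADS_API = "https://api.npmjs.org/downloads/point"


def downloads_url(package, period="last-week"):
    """Build the URL for a package's total downloads over a period."""
    return f"{DOWNLOADS_API}/{period}/{package}"


def download_count(payload):
    """Extract the count from a parsed response.

    Assumption: the response has a numeric "downloads" field.
    """
    return payload["downloads"]


def fetch_download_count(package, period="last-week"):
    """Download and parse the count for one package."""
    with urllib.request.urlopen(downloads_url(package, period)) as response:
        return download_count(json.load(response))


# Usage (performs a network request):
#   print(fetch_download_count("lodash"))
```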


 