
Analyse a huge amount of blockchain data

I am trying to go over all the transaction data from every block on the Bitcoin blockchain from the previous 4 years. With almost 2k transactions per block, that means a lot of queries per block. I have a full node running locally and I tried two ways:

Python with RPC: This is very slow and keeps losing the connection after some time (httpx.ReadTimeout).

Python with os.popen commands: Doesn't have the connection problem, but is still very slow.

Would there be any other way? Any recommendation on how to analyze bulk data from the blockchain? The methods listed above are unfeasible given the time they would take.

EDIT: The problem isn't memory, but the time the Bitcoin node takes to answer the queries.

Hey, there are different ways to fetch Bitcoin blockchain data:

  • Network level, using P2P messages (this method doesn't require setting up a node)
  • Parsing the .blk files that are synchronized by your node
  • Querying the RPC application interface
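To give an idea of what the P2P option involves, here is a minimal sketch of the message framing only, assuming the standard 24-byte header (network magic, null-padded command name, little-endian payload length, first 4 bytes of the double SHA-256 of the payload). The function names are made up for illustration, and a real client would also need the version/verack handshake and a TCP connection to a peer, which are not shown.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # network magic for Bitcoin mainnet


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for P2P checksums."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def frame_message(command: str, payload: bytes = b"",
                  magic: bytes = MAINNET_MAGIC) -> bytes:
    """Wrap a payload in the 24-byte P2P message header (illustrative helper)."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command name is at most 12 bytes")
    header = (
        magic
        + name.ljust(12, b"\x00")          # command, padded with null bytes
        + struct.pack("<I", len(payload))  # payload length, little-endian
        + double_sha256(payload)[:4]       # checksum
    )
    return header + payload


def parse_message_header(raw: bytes):
    """Split a 24-byte header into (magic, command, payload length, checksum)."""
    magic, name, length, checksum = struct.unpack("<4s12sI4s", raw[:24])
    return magic, name.rstrip(b"\x00").decode("ascii"), length, checksum
```

For example, `frame_message("verack")` yields a 24-byte message with an empty payload. Messages that carry blocks would have to be decoded after their header is parsed.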

P2P messages and .blk files are raw encoded, so you will need to decode blocks and transactions yourself.
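As a rough illustration of the .blk route, the sketch below walks the contents of one block file and decodes the 80-byte block headers. It assumes the usual on-disk layout (4-byte network magic, 4-byte little-endian block size, then the serialized block) and mainnet magic bytes. The helper names are hypothetical, transactions are left undecoded, and blocks inside these files are not guaranteed to be in chain order. As far as I know, recent Bitcoin Core releases can also XOR-obfuscate these files on disk, in which case the bytes would need to be de-obfuscated first.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # assumed: mainnet block files


def iter_raw_blocks(data: bytes, magic: bytes = MAINNET_MAGIC):
    """Yield each serialized block found in the contents of one blk*.dat file."""
    pos = 0
    while pos + 8 <= len(data):
        if data[pos:pos + 4] != magic:
            break  # e.g. zero padding at the end of a preallocated file
        (size,) = struct.unpack_from("<I", data, pos + 4)
        start = pos + 8
        if start + size > len(data):
            break  # truncated block, stop here
        yield data[start:start + size]
        pos = start + size


def read_varint(buf: bytes, pos: int):
    """Decode a CompactSize integer; return (value, position after it)."""
    first = buf[pos]
    if first < 0xFD:
        return first, pos + 1
    width, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    return value, pos + 1 + width


def parse_block_header(block: bytes) -> dict:
    """Decode the 80-byte header and the transaction count that follows it."""
    version, prev, merkle, timestamp, bits, nonce = struct.unpack_from(
        "<i32s32sIII", block, 0
    )
    digest = hashlib.sha256(hashlib.sha256(block[:80]).digest()).digest()
    tx_count, _ = read_varint(block, 80)
    return {
        "hash": digest[::-1].hex(),       # hashes are displayed byte-reversed
        "version": version,
        "prev_hash": prev[::-1].hex(),
        "merkle_root": merkle[::-1].hex(),
        "time": timestamp,
        "bits": bits,
        "nonce": nonce,
        "tx_count": tx_count,
    }
```

In practice you would read a file such as `blocks/blk00000.dat` from the node's data directory in binary mode and pass its bytes to `iter_raw_blocks`. Decoding the transactions themselves follows the same pattern (fixed-width fields plus CompactSize counts) but is longer.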

The RPC interface abstracts away the raw decoding, but it's slower (because the node has to decode the data for you).
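If you stay with RPC, the number of round trips can at least be reduced. The sketch below assumes Bitcoin Core's JSON-RPC batch support (several requests posted as one JSON array) and `getblock` with verbosity 2, which returns a block together with all of its decoded transactions in a single reply. The URL, credentials and helper names are placeholders, not part of the original setup, and the generous timeout is only a guess at what large blocks might need.

```python
import base64
import json
import urllib.request


def build_batch(method: str, params_list: list) -> list:
    """Build one JSON-RPC batch: the same method called with many parameter sets."""
    return [
        {"jsonrpc": "1.0", "id": i, "method": method, "params": params}
        for i, params in enumerate(params_list)
    ]


def post_batch(url: str, user: str, password: str, batch: list,
               timeout: float = 300.0) -> list:
    """POST a batch to the node and return the replies ordered by request id."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=json.dumps(batch).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        replies = json.load(response)
    return sorted(replies, key=lambda reply: reply["id"])


# Hypothetical usage against a local node (placeholder URL and credentials):
# url = "http://127.0.0.1:8332/"
# heights = [[h] for h in range(700000, 700010)]
# hashes = [r["result"] for r in
#           post_batch(url, "user", "password", build_batch("getblockhash", heights))]
# blocks = post_batch(url, "user", "password",
#                     build_batch("getblock", [[h, 2] for h in hashes]))
```

This still makes the node decode every block, so it is unlikely to match reading the raw files, but it replaces thousands of per-transaction queries with one request per group of blocks.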

We wrote a paper with Matthieu Latapy that gives instructions on collecting the whole Bitcoin blockchain and indexing it in order to make parsing efficient.
