Python faster than C++ in deserializing json. Why? Is the json library in Python 3 written in C/C++ or other low-level language?

C/C++ is well known for being faster than Python in many cases. I ran a test to check this.

I have a large (pretty-printed) JSON file with 2200 lines. The test consisted of reading the file, deserializing the data in memory (I used dictionaries as the data structure), and displaying the content.

I ran the test both in Python, using the built-in json library, and in C++, using the external nlohmann JSON library.

After a few runs, I was shocked to see that C++ takes 0.01 seconds and Python 3 takes about 0.001 seconds, which makes Python almost 10 times faster!

I searched the docs but did not find any information about what language the json library is written in.

C++:

#include <fstream>   // std::ifstream (this header was missing)
#include <iostream>
#include "nlohmann/json.hpp"

using json = nlohmann::json;

int main()
{
    // Open the input file.
    std::ifstream input("input.json");

    // Parse the whole stream into a JSON value.
    json json_data;
    input >> json_data;

    // Serialize the value back to stdout.
    std::cout << json_data << std::endl;

    return 0;
}

And Python:

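import json
from time import time

t1 = time()
with open('output.json', 'r') as f:
    # Parse the file into Python objects; use a separate name
    # instead of rebinding the file handle f.
    data = json.load(f)

print(data)
t2 = time()
elapsed = t2 - t1

print('elapsed time: ' + str(elapsed))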

Final question: is the Python json library by any chance written in a low-level language, and is that the main reason for its performance, or is it pure Python?

A poorly written library, no matter what language it is written in, can give you abysmal speed.

There are a few specialized and highly optimized JSON parsers in C++, including rapidjson and simdjson; see this recent comparison:

https://lemire.me/blog/2020/03/31/we-released-simdjson-0-3-the-fastest-json-parser-in-the-world-is-even-better/

C/C++ is well known for being faster than Python in many cases.

Not in many cases, always.

Of course, if your C/C++ code is badly written, it can be as slow as you want.

I ran the test both in Python, using the built-in json library, and in C++, using the external nlohmann JSON library.

The nlohmann JSON library is slower than other alternatives. It is definitely possible that it is slower than CPython's implementation. Use another library if you need speed.

Having said that, please note that benchmarking is hard. It may be the case that, as @Jesper and @idclev mention, you are simply missing optimizations when compiling the C++ code.
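As a rough illustration of why the numbers in the question are hard to interpret, here is a minimal Python sketch that times only the parse step, repeated several times, instead of a single run that also includes file I/O and printing. The helper name time_parse and the synthetic records are assumptions made for this example; they stand in for the asker's file, which is not available here. On the C++ side, the comparable setup would be to time only the parse call and to build with optimizations enabled (for example -O2 with GCC or Clang).

```python
import json
import time


def time_parse(text: str, repeats: int = 20) -> float:
    """Return the best wall-clock time, in seconds, of json.loads(text)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        json.loads(text)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for the asker's file: roughly 2,000 pretty-printed lines.
records = [{"id": i, "name": f"item{i}", "tags": ["a", "b"]} for i in range(250)]
text = json.dumps(records, indent=4)

print(f"lines: {text.count(chr(10)) + 1}")
print(f"best parse time: {time_parse(text):.6f} s")
```

Taking the best of several runs reduces noise from the OS and from cold caches; printing to a terminal is often slower than the parse itself, so leaving it inside the timed region can dominate the measurement.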

Is the json library by any chance written in a low-level language, and is this the main reason for its performance, or is it just pure Python?

Yes, the CPython implementation is written in C, as @jonrsharpe pointed out.
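To make this concrete: in CPython, the json package tries to import an optional C extension module, _json, and falls back to pure-Python code if that import fails. The sketch below checks whether the C scanner was loaded and builds a decoder that is forced onto the pure-Python path, so the two can be compared on your own machine. The names has_c_accelerator and PurePythonDecoder are made up for this example, and the attributes it touches (json.scanner.c_make_scanner, json.scanner.py_make_scanner, json.decoder.py_scanstring) are CPython implementation details rather than a documented public API, so treat it as an illustration only.

```python
import json
import json.decoder
import json.scanner


def has_c_accelerator() -> bool:
    """Return True if the optional C scanner from the _json module was imported."""
    return json.scanner.c_make_scanner is not None


class PurePythonDecoder(json.JSONDecoder):
    """A decoder forced onto the pure-Python code paths (illustration only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Swap in the pure-Python string scanner, then rebuild the scanner
        # so that it picks up the replaced parse_string.
        self.parse_string = json.decoder.py_scanstring
        self.scan_once = json.scanner.py_make_scanner(self)


sample = '{"name": "test", "values": [1, 2.5, true, null], "nested": {"k": "v"}}'
fast = json.loads(sample)                          # default decoder
slow = json.loads(sample, cls=PurePythonDecoder)   # pure-Python decoder

print("C accelerator available:", has_c_accelerator())
print("same result:", fast == slow)
```

Timing json.loads with and without cls=PurePythonDecoder on a large document gives a feel for how much of the built-in library's speed comes from the C code.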
