
Python faster than C++ at deserializing JSON. Why? Is the json library in Python 3 written in C/C++ or another low-level language?

C/C++ is well known for being in many cases faster than python. I ran a test to check this.

I have a large (beautified) JSON file with 2200 lines. The test consisted of reading the file, deserializing the data into memory (I used dictionaries as the data structure), and displaying the content.

I performed the test both in python using the built-in json library and in C++ using the external nlohmann JSON library.

After a few runs, I was shocked to see that C++ takes about 0.01 seconds while Python 3 takes about 0.001 seconds, which makes Python almost 10 times faster!

I searched the docs but did not find any information about what language the json library is written in.

C++:

#include <fstream>   // std::ifstream (missing in the original listing)
#include <iostream>
#include "nlohmann/json.hpp"

using json = nlohmann::json;

int main()
{
    // Open the JSON file to parse.
    std::ifstream input("input.json");
    if (!input) {
        std::cerr << "could not open input.json" << std::endl;
        return 1;
    }

    // Deserialize the whole file into a nlohmann::json value.
    json json_data;
    input >> json_data;

    // Print the parsed document.
    std::cout << json_data << std::endl;

    return 0;
}

And Python:

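import json
from time import time

t1 = time()
# Read-only access is enough for parsing.
with open('output.json', 'r') as f:
    # Keep the parsed data in its own name instead of rebinding the file handle.
    data = json.load(f)

    print(data)
t2 = time()
elapsed = t2 - t1

print('elapsed time: ' + str(elapsed))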

Final question: is Python's json library by any chance written in a low-level language, and is that the main reason for its performance, or is it just pure Python?

A poorly written library, no matter what language it was written in, can give you abysmal speed.

There are a few specialized and highly optimized JSON parsers in C++, including rapidjson and simdjson; see this recent comparison:

https://lemire.me/blog/2020/03/31/we-released-simdjson-0-3-the-fastest-json-parser-in-the-world-is-even-better/

C/C++ is well known for being in many cases faster than python.

Not in many cases, always.

Of course, if your C/C++ code is badly written, it can be as slow as you want.

I performed the test both in python using the built-in json library and in C++ using the external nlohmann JSON library.

The nlohmann JSON library is slower than other alternatives. It is definitely possible that it is slower than CPython's implementation. Use another library if you need speed.

Having said that, please note that benchmarking is hard. It may be the case that, as @Jesper and @idclev mention, you are simply missing optimizations when compiling the C++ code.
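As a rough illustration of why benchmarking is hard, here is a minimal Python sketch that times only the parsing step, repeats the measurement, and keeps the best run. Printing the parsed data (as both programs in the question do) can easily dominate a measurement this small. The helper name time_json_loads and the generated sample document are hypothetical stand-ins, not the asker's actual file or method.

```python
import json
import time


def time_json_loads(text, repeats=5):
    """Return the best wall-clock time (in seconds) of json.loads(text) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        json.loads(text)  # parse only; no printing inside the timed region
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for the beautified JSON file from the question.
sample = json.dumps(
    {"items": [{"id": i, "name": f"item-{i}"} for i in range(500)]},
    indent=4,
)

elapsed = time_json_loads(sample)
print(f"best parse time: {elapsed:.6f} s")
```

The same idea applies on the C++ side: time the parse separately from the output, and build with optimizations enabled before comparing numbers.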

is the json library by any chance written in any low level language and this is the main reason for performance, or is just pure python?

Yes: in CPython the json module is backed by a C implementation, as @jonrsharpe pointed out.
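One way to check this on a given interpreter is sketched below. It relies on the fact that CPython's json package tries to import its C helpers and sets the names c_make_scanner and c_scanstring to None when they are unavailable. The function name uses_c_accelerator is a hypothetical helper for illustration, and other Python implementations may behave differently.

```python
import json.decoder
import json.scanner


def uses_c_accelerator():
    """Return True if the json package loaded its C helpers on this interpreter."""
    return (
        json.scanner.c_make_scanner is not None
        and json.decoder.c_scanstring is not None
    )


print("C accelerator in use:", uses_c_accelerator())
```

If this reports False, json.load falls back to the pure-Python scanner, and the timings from the question would likely look quite different.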
