Read a CSV file exported from MongoDB in Python

Question

I have been trying for hours to load a CSV file into Python using the well-known pd.read_csv('..').

However, it fails with this error:

Error message: Error tokenizing data. C error: Expected 3991 fields in line 14, saw 4572

As far as I can tell, my code is correct.

The CSV looks like this:

{"_id":{"$oid":"5cf683d88eb9ad12c84f6469"},"ID":"22991137","name":"M. Lundströ

Maybe the problem occurs because MongoDB uses strict BSON formats, but honestly, I do not know anything about that.

Does anyone have a solution?

Answer 1

pd.read_csv() only works on CSV files. The content you show looks like JSON to me, not CSV (and invalid JSON as posted, since the braces are not closed).
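One way to check which format the file is actually in is to try parsing its first line as JSON. This is only a minimal sketch; the helper name looks_like_json is made up for illustration:

```python
import json


def looks_like_json(line):
    """Return True if the line parses as a JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        # A CSV header or data row is not valid JSON and ends up here.
        return False


print(looks_like_json('{"_id":{"$oid":"5cf683d88eb9ad12c84f6469"},"ID":"22991137"}'))
print(looks_like_json('_id,ID,name'))
```

With a real file you would pass it the first line, e.g. the result of f.readline() on the opened file. If it returns True, pd.read_csv() is the wrong reader for that file.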

To get a real CSV file, you need to export from MongoDB like this:

mongoexport --db dbname --collection col --type=csv --fields _id,field1,field2 --out outfile.csv

EDIT:

If you only want to read the JSON file, you can do it like this:

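import json

# json.load() expects the file to hold a single JSON value,
# for example an export made with mongoexport --jsonArray.
with open('filepath', encoding='utf-8') as f:
    data = json.load(f)
    print(data)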
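Note that mongoexport writes one JSON document per line by default, and json.load() on such a file fails with an "Extra data" error. In that case each line can be parsed on its own. A minimal sketch, assuming the file really is line-delimited JSON; read_json_lines is a made-up helper name and the sample documents are invented in the shape shown in the question:

```python
import io
import json


def read_json_lines(lines):
    """Parse an iterable of lines, each holding one JSON document."""
    # Skip blank lines so a trailing newline does not raise an error.
    return [json.loads(line) for line in lines if line.strip()]


# Made-up two-document sample; a real file object works the same way.
sample = io.StringIO(
    '{"_id":{"$oid":"5cf683d88eb9ad12c84f6469"},"ID":"22991137"}\n'
    '{"_id":{"$oid":"5cf683d88eb9ad12c84f646a"},"ID":"22991138"}\n'
)
records = read_json_lines(sample)
print(len(records), records[0]["ID"])
```

For a real export, pass the file object opened with open('filepath', encoding='utf-8') instead of the sample. The resulting list of dicts can then be handed to pd.json_normalize(records), which should give a DataFrame with nested keys flattened into columns such as _id.$oid.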