I have a gzip-compressed JSON file (.json.gz) stored in a bucket in Google Cloud Storage, which I want to read and copy into a Postgres table. The file is just a JSON array with no nested objects, like this:
[{
  "date": "2019-03-10T07:00:00.000Z",
  "type": "chair",
  "total": 250.0,
  "payment": "cash"
},{
  "date": "2019-03-10T07:00:00.000Z",
  "type": "shirt",
  "total": 100.0,
  "payment": "credit card"
},{
.
.
}]
Previously I did a similar job with a CSV file: I used the download_as_string function to store the contents in a variable, used StringIO to turn that variable into a file-like object, and passed it to the copy_expert() function with a COPY query (this link).
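For reference, the CSV workflow described above can be sketched roughly like this (the table name and connection details are placeholders, not from the original post):

```python
import io

def load_csv_to_postgres(conn, csv_text, table):
    # Wrap the downloaded CSV string in a file-like object and stream it
    # into Postgres with copy_expert(), mirroring the described approach.
    buf = io.StringIO(csv_text)
    with conn.cursor() as cur:
        cur.copy_expert(f"COPY {table} FROM STDIN WITH CSV HEADER", buf)
    conn.commit()

# Hypothetical usage with psycopg2 (credentials are placeholders):
# import psycopg2
# conn = psycopg2.connect("dbname=mydb user=me")
# load_csv_to_postgres(conn, downloaded_csv, "my_table")
```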
So, how can I read a json.gz file in GCS and write it to a table with Python?
Thank you
To read the data in, I'd go with gcsfs, a Python interface to GCS:
import gcsfs
import gzip
import json

fs = gcsfs.GCSFileSystem(project='my-project')

with fs.open('bucket/path.json.gz') as f:
    # Decompress the gzip stream and parse the JSON payload
    gz = gzip.GzipFile(fileobj=f)
    file_as_string = gz.read()
    your_json = json.loads(file_as_string)
Now that you have the parsed JSON, you can reuse the same approach you were using with CSV.