Python 將 JsonL 轉換為 CSV 文件時插入 CSV 文件的標頭

Question

目前正在研究將文件從 jsonl 格式轉換為 CSV 格式的腳本，最后我寫了一行來包含 csv 文件的 header 以識別每個轉換的變量。 However, it seems that the CSV file generated from the script seems to have a header for each json line read, and I want just a file that has the header on row 1 with the rest of the values below, instead of a separate header for每個單獨的 json 行讀取。 我希望有人可以幫助我解決這個問題，謝謝！

示例 jsonl：

{"symbol": "DOGE-PERP", "timestamp": 1621948955550, "datetime": "2021-05-25T13:22:35.550Z", "high": null, "low": null, "bid": 0.342372, "bidVolume": null, "ask": 0.3424855, "askVolume": null, "vwap": null, "open": null, "close": 0.3424025, "last": 0.3424025, "previousClose": null, "change": null, "percentage": 0.039249281423858244, "average": null, "baseVolume": null, "quoteVolume": 433162290.0506585, "info": {"name": "DOGE-PERP", "enabled": true, "postOnly": false, "priceIncrement": "5e-7", "sizeIncrement": "1.0", "minProvideSize": "1.0", "last": "0.3424025", "bid": "0.342372", "ask": "0.3424855", "price": "0.3424025", "type": "future", "baseCurrency": null, "quoteCurrency": null, "underlying": "DOGE", "restricted": false, "highLeverageFeeExempt": false, "change1h": "0.023470298206100425", "change24h": "0.039249281423858244", "changeBod": "-0.07136396489976689", "quoteVolume24h": "433162290.0506585", "volumeUsd24h": "433162290.0506585"}}
{"symbol": "DOGE-PERP", "timestamp": 1621948955976, "datetime": "2021-05-25T13:22:35.976Z", "high": null, "low": null, "bid": 0.3424955, "bidVolume": null, "ask": 0.3427185, "askVolume": null, "vwap": null, "open": null, "close": 0.3427185, "last": 0.3427185, "previousClose": null, "change": null, "percentage": 0.04020839466903005, "average": null, "baseVolume": null, "quoteVolume": 433162290.0506585, "info": {"name": "DOGE-PERP", "enabled": true, "postOnly": false, "priceIncrement": "5e-7", "sizeIncrement": "1.0", "minProvideSize": "1.0", "last": "0.3427185", "bid": "0.3424955", "ask": "0.3427185", "price": "0.3427185", "type": "future", "baseCurrency": null, "quoteCurrency": null, "underlying": "DOGE", "restricted": false, "highLeverageFeeExempt": false, "change1h": "0.024414849178225707", "change24h": "0.04020839466903005", "changeBod": "-0.07050693556414092", "quoteVolume24h": "433162290.0506585", "volumeUsd24h": "433162290.0506585"}}

CSV 文件目前的樣子：

我想要達到的目標：

我的腳本：

import glob
import json
import csv

import time


start = time.time()
#import pandas as pd
from flatten_json import flatten

#Path of jsonl file
File_path = (r'C:\Users\Natthanon\Documents\Coding 101\Python\JSONL')
#reading all jsonl files
files = [f for f in glob.glob( File_path + "**/*.jsonl", recursive=True)]
i=0
for f in files:
    with open(f, 'r') as F:
        for line in F:
#flatten json files 
            data = json.loads(line)
            data_1=flatten(data)
#creating csv files  
            with open(r'C:\Users\Natthanon\Documents\Coding 101\Python\CSV\\' + f.split("\\")[-1] +".csv", 'a' , newline='') as csv_file:
                thewriter = csv.writer(csv_file)
                thewriter.writerow(["symbol", "timestamp", "datetime","high","low","bid","bidVolume","ask","askVolume","vwap","open","close","last","previousClose","change","percentage","average","baseVolume","quoteVolume"])
#headers should be the Key values from json files that make Coulmn header                    
                thewriter.writerow([data_1['symbol'],data_1['timestamp'],data_1['datetime'],data_1['high'],data_1['low'],data_1['bid'],data_1['bidVolume'],data_1['ask'],data_1['askVolume'],data_1['vwap'],data_1['open'],data_1['close'],data_1['last'],data_1['previousClose'],data_1['change'],data_1['percentage'],data_1['average'],data_1['baseVolume'],data_1['quoteVolume']])

Answer 1

在開始解析行之前，您需要將 output CSV 文件open()移動到，例如：

import glob
import json
import csv
import time


start = time.time()
#import pandas as pd
from flatten_json import flatten

#Path of jsonl file
File_path = (r'C:\Users\Natthanon\Documents\Coding 101\Python\JSONL')
#reading all jsonl files
files = [f for f in glob.glob( File_path + "**/*.jsonl", recursive=True)]
i = 0

for f in files:
    with open(f, 'r') as F:
        #creating csv files  
        with open(r'C:\Users\Natthanon\Documents\Coding 101\Python\CSV\\' + f.split("\\")[-1] + ".csv", 'w' , newline='') as csv_file:
            thewriter = csv.writer(csv_file)
            thewriter.writerow(["symbol", "timestamp", "datetime","high","low","bid","bidVolume","ask","askVolume","vwap","open","close","last","previousClose","change","percentage","average","baseVolume","quoteVolume"])

            for line in F:
                #flatten json files 
                data = json.loads(line)
                data_1 = flatten(data)
                #headers should be the Key values from json files that make Column header                    
                thewriter.writerow([data_1['symbol'],data_1['timestamp'],data_1['datetime'],data_1['high'],data_1['low'],data_1['bid'],data_1['bidVolume'],data_1['ask'],data_1['askVolume'],data_1['vwap'],data_1['open'],data_1['close'],data_1['last'],data_1['previousClose'],data_1['change'],data_1['percentage'],data_1['average'],data_1['baseVolume'],data_1['quoteVolume']])

在您的代碼中，您將打開和關閉每一行的文件，並且每次都添加 header。

Python 將 JsonL 轉換為 CSV 文件時插入 CSV 文件的標頭

問題描述

1 個解決方案

解決方案1
1 已采納 2021-06-06 16:52:52

Python 將 JsonL 轉換為 CSV 文件時插入 CSV 文件的標頭

問題描述

1 個解決方案

解決方案1 1 已采納 2021-06-06 16:52:52

解決方案1
1 已采納 2021-06-06 16:52:52