
Converting a text file into JSON in a specific format (Python)

I have stock data in a text file and I want to convert it into JSON in a specific format. The values on each line are separated by commas (,) and each line holds the data for one 1-minute interval. Some lines also carry extra, unnecessary data at the end, so after conversion I want only seven values per record (date, time, open, high, low, close, volume): the first field and anything after the eighth field should be dropped.

Input data:

BANKNIFTY_F1,20150228,15:27,19904.65,19924.00,19900.40,19920.20,31225
BANKNIFTY_F1,20150228,15:28,19921.05,19941.30,19921.05,19937.00,31525
BANKNIFTY_F1,20150228,15:29,19932.45,19945.00,19930.10,19945.00,38275
BANKNIFTY_F1,20150228,15:30,19947.00,19949.40,19930.00,19943.80,43400
BANKNIFTY_F1,20150302,09:16,20150.15,20150.15,20021.50,20070.00,91775,2026525
BANKNIFTY_F1,20150302,09:17,20071.50,20085.00,20063.50,20063.50,45700,2026525

Expected output data:

[{"date":"20150228","time":"15:27","open":"19904.65","high":"19924.00","low":"19900.40","close":"19920.20","volume":"31225"},
 {"date":"20150228","time":"15:28","open":"19921.05","high":"19941.30","low":"19921.05","close":"19937.00","volume":"31525"},
 {"date":"20150228","time":"15:29","open":"19932.45","high":"19945.00","low":"19930.10","close":"19945.00","volume":"38275"},
 {"date":"20150228","time":"15:30","open":"19947.00","high":"19949.40","low":"19930.00","close":"19943.80","volume":"43400"},
 {"date":"20150302","time":"09:16","open":"20150.15","high":"20150.15","low":"20021.50","close":"20070.00","volume":"91775"},
 {"date":"20150302","time":"09:17","open":"20071.50","high":"20085.00","low":"20063.50","close":"20063.50","volume":"45700"}]

Please note that in the expected output, the trailing extra value at the end of the last two input lines is ignored.

You want to transform a CSV file to JSON. When working with CSV files in Python, pandas DataFrames are a good first option. So first install pandas (pip install pandas).

Read the CSV file as a pandas DataFrame, set the column headers to your keys, and then convert it to a list of dictionaries with the built-in DataFrame.to_dict method; that list can then be serialized to JSON. Just a few lines of code.

You will first need to get rid of the data that you do not need. If you only want specific columns, the usecols parameter of pd.read_csv selects them. Then do this:

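import json
import pandas as pd

# Name every field so that rows of different lengths parse, then keep fields 2-8.
# Assumes at most one trailing extra value per line, as in the sample data.
columns = ["symbol", "date", "time", "open", "high", "low", "close", "volume", "extra"]
dataframe = pd.read_csv(
    "stockdata.txt",
    header=None,
    names=columns,
    usecols=columns[1:8],
    dtype=str,  # keep values as strings, e.g. "19924.00"
)

# this is a list of Python dictionaries, one per row
json_dictionary = dataframe.to_dict('records')

print(json_dictionary)

# optionally convert to a JSON string
json_string = json.dumps(json_dictionary)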

You can also use the dtype parameter of pd.read_csv to set specific data types for your columns.
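As a rough illustration of setting column types, the sketch below reads two of the question's sample rows from an in-memory string instead of a file. The column names "symbol" and "extra" are placeholders for the fields being dropped, and the choice of float and int types is an assumption; the question itself asks for strings.

```python
import io
import pandas as pd

# Two sample rows from the question, inlined so the sketch runs without a file.
sample = io.StringIO(
    "BANKNIFTY_F1,20150228,15:27,19904.65,19924.00,19900.40,19920.20,31225\n"
    "BANKNIFTY_F1,20150302,09:16,20150.15,20150.15,20021.50,20070.00,91775,2026525\n"
)

# "symbol" and "extra" are hypothetical names for the fields that get dropped.
columns = ["symbol", "date", "time", "open", "high", "low", "close", "volume", "extra"]

typed = pd.read_csv(
    sample,
    header=None,
    names=columns,
    usecols=columns[1:8],  # keep date .. volume only
    dtype={
        "date": str, "time": str,  # keep leading zeros such as "09:16"
        "open": float, "high": float, "low": float, "close": float,
        "volume": int,
    },
)

records = typed.to_dict("records")
print(records[0])
```

With numeric types the prices become numbers in the resulting JSON (so "19924.00" is written as 19924.0); use dtype=str if the exact text of each value has to be preserved.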

Assuming all the lines in the text file are built the same way, you could iterate over the file line by line and split each line in a strict way, like this:

my_tokens = []
with open('stockdata.txt') as f:
    # iterate over the file line by line (f.read() would yield single characters)
    for line in f:
        tokens = line.strip().split(',')
        my_dict = {}
        try:
            my_dict['date'] = tokens[1]
            my_dict['time'] = tokens[2]
            my_dict['open'] = tokens[3]
            my_dict['high'] = tokens[4]
            my_dict['low'] = tokens[5]
            my_dict['close'] = tokens[6]
            my_dict['volume'] = tokens[7]
        except IndexError:
            # skip lines that have fewer than eight fields
            continue
        my_tokens.append(my_dict)

That's not the prettiest answer, but it works on your type of data (:
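A more compact variant of the same line-by-line idea could use the standard csv module together with a slice, so the seven keys are paired with fields 2 to 8 in one step. This is only a sketch: the helper name parse_rows and the inlined sample are illustrative, not part of the original answer.

```python
import csv
import io

KEYS = ["date", "time", "open", "high", "low", "close", "volume"]

def parse_rows(lines):
    """Turn comma-separated lines into dicts, keeping only fields 2 to 8."""
    records = []
    for row in csv.reader(lines):
        if len(row) < 8:
            continue  # skip blank or incomplete lines
        # row[0] is the symbol; anything after row[7] is ignored
        records.append(dict(zip(KEYS, row[1:8])))
    return records

# Two sample rows from the question, plus a blank line to show it is skipped.
sample = io.StringIO(
    "BANKNIFTY_F1,20150228,15:30,19947.00,19949.40,19930.00,19943.80,43400\n"
    "\n"
    "BANKNIFTY_F1,20150302,09:16,20150.15,20150.15,20021.50,20070.00,91775,2026525\n"
)

records = parse_rows(sample)
print(records)
```

For a real file, the same function can be called as parse_rows(f) inside a with open(...) block, since csv.reader accepts any iterable of lines.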

You can simply do this using file handling in Python.

import json
from pprint import pprint

stocks = []

with open('stocks.txt', 'r') as data:
    for line in data:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ldata = line.split(',')
        # ldata[0] is the symbol; anything after ldata[7] is ignored
        temp_stock = {
            'date':ldata[1],
            'time':ldata[2],
            'open':ldata[3],
            'high':ldata[4],
            'low':ldata[5],
            'close':ldata[6],
            'volume':ldata[7]
        }
        stocks.append(temp_stock)

# write the records to a JSON file
with open('stocks.json', 'w') as fp:
    json.dump(stocks, fp, indent=4)

pprint(stocks)

Or else, as a single list comprehension:

keys = ['date', 'time', 'open', 'high', 'low', 'close', 'volume']

with open('stocks.txt', 'r') as data:
    # split each line once; fields 1-7 map onto the keys, field 0 and any extras are dropped
    res = [dict(zip(keys, line.strip().split(',')[1:8])) for line in data]

Output:

[{'close': '19920.20',
  'date': '20150228',
  'high': '19924.00',
  'low': '19900.40',
  'open': '19904.65',
  'time': '15:27',
  'volume': '31225'},
 {'close': '19937.00',
  'date': '20150228',
  'high': '19941.30',
  'low': '19921.05',
  'open': '19921.05',
  'time': '15:28',
  'volume': '31525'},
 {'close': '19945.00',
  'date': '20150228',
  'high': '19945.00',
  'low': '19930.10',
  'open': '19932.45',
  'time': '15:29',
  'volume': '38275'},
 {'close': '19943.80',
  'date': '20150228',
  'high': '19949.40',
  'low': '19930.00',
  'open': '19947.00',
  'time': '15:30',
  'volume': '43400'},
 {'close': '20070.00',
  'date': '20150302',
  'high': '20150.15',
  'low': '20021.50',
  'open': '20150.15',
  'time': '09:16',
  'volume': '91775'},
 {'close': '20063.50',
  'date': '20150302',
  'high': '20085.00',
  'low': '20063.50',
  'open': '20071.50',
  'time': '09:17',
  'volume': '45700'}]
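Note that pprint shows the Python representation of the list (single quotes, keys sorted alphabetically), not the JSON text itself. To get the compact, double-quoted form shown in the question, json.dumps can be called with explicit separators. The sketch below uses a single hard-coded record in place of the full stocks list.

```python
import json

# One record from the question, standing in for the full `stocks` list.
stocks = [
    {"date": "20150228", "time": "15:27", "open": "19904.65", "high": "19924.00",
     "low": "19900.40", "close": "19920.20", "volume": "31225"},
]

# separators=(",", ":") drops the default spaces after commas and colons.
compact = json.dumps(stocks, separators=(",", ":"))
print(compact)
# [{"date":"20150228","time":"15:27","open":"19904.65","high":"19924.00","low":"19900.40","close":"19920.20","volume":"31225"}]
```

The keys keep their insertion order, so building each dictionary in date, time, open, high, low, close, volume order is enough to match the expected layout.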
