
Trying to convert a big tsv file to json

I have a TSV file that I need to convert to a JSON file. I'm using this Python script, but it exports an empty JSON file.

import json
data={}
with open('data.json', 'w') as outfile,open("data.tsv","r") as f:
    for line in f:
        sp=line.split()
        data.setdefault("data",[])
    json.dump(data, outfile)

This can be done with pandas, but I am not sure about its performance on a big file.

df.to_json

import pandas as pd

df = pd.read_csv('data.tsv', sep='\t')  # read your TSV file
df.to_json('data.json')  # save it as JSON; set orient='values', 'columns', etc. as per your requirements
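
If one JSON object per row is what you want, the `orient` argument matters: for a DataFrame, `to_json` defaults to `orient='columns'`, which nests the values by column and row index. Below is a minimal sketch using `orient='records'`. The function name `tsv_to_json_records` is made up for illustration, and reading everything as strings (`dtype=str`, `keep_default_na=False`) is an assumption about what you want, not a requirement.

```python
import pandas as pd


def tsv_to_json_records(tsv_path, json_path):
    """Convert a TSV file to a JSON array with one object per row."""
    # dtype=str keeps every value as text instead of letting pandas infer types;
    # keep_default_na=False stops empty cells from being turned into NaN.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str, keep_default_na=False)
    # orient="records" writes [{"col": value, ...}, ...].
    df.to_json(json_path, orient="records")
```

Calling `tsv_to_json_records('data.tsv', 'data.json')` loads the whole file into memory first, so for a very large file a streaming approach may suit better.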

You never use the sp variable in your code, so nothing from the file is ever added to data.

To properly convert the tsv, you should read the first line separately, to get the "column names", then read the following lines and populate a list of dictionaries.

Here's what your code should look like:

import json

data = []
with open("data.tsv", "r") as f, open("data.json", "w") as outfile:
    # The first line holds the column names.
    columns = f.readline().rstrip("\n").split("\t")
    # Every remaining line is one row; split on tabs so values may contain spaces.
    for line in f:
        values = line.rstrip("\n").split("\t")
        data.append(dict(zip(columns, values)))
    json.dump(data, outfile)

This will output a file containing a list of tsv rows as objects.
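
Since the file is big, holding every row in a list may use a lot of memory. Here is a rough sketch of the same header-then-rows idea using the standard `csv` module, writing one JSON object per line (the JSON Lines layout) instead of a single JSON array. Note that this is a different output format from the one above. The function name `tsv_to_json_lines` and the UTF-8 encoding are assumptions for illustration.

```python
import csv
import json


def tsv_to_json_lines(tsv_path, out_path):
    """Stream a TSV file to a file with one JSON object per line."""
    count = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        # DictReader takes the column names from the first row.
        reader = csv.DictReader(src, delimiter="\t")
        for row in reader:
            # Only one row is held in memory at a time.
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

`tsv_to_json_lines('data.tsv', 'data.jsonl')` returns the number of rows written. A consumer would then read the output line by line with `json.loads` rather than with a single `json.load`.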

Nowadays, an online converter is a very simple way to solve this kind of problem.

You can try https://toolsofweb.com/tsv-to-json to convert TSV to JSON.


 