Python: Convert excel table to a specific JSON format

I have an Excel table in the following format:

[Image: Excel data format]

I want to convert this table to a JSON format that looks like this:

data = {'Program': {1: {'Name': 'John', 'Program': 'BS', 'Age': 29}, 2: {'Name': 'Doe', 'Program': 'MS', 'Age': 35}},  'Locations': {'New York': {1: 78, 2: 80, 3: 36, 4: 44}, 'Chicago': {1: 68, 2: 53, 3: 87, 4: 130}, 'Houston': {1: 57, 2: 89, 3: 64, 4: 77}, 'Alabama': {1: 98, 2: 124, 3: 73, 4: 82}}, 'name_ratings': {'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}}

I am using openpyxl to load the Excel file in Python and iterate over the rows.

for col in sheet.iter_rows(min_row=1, min_col=1, max_row=5, max_col=8):
    for cell in col:
        print(cell.value)

Can anyone please help me with this?

File attached: sample excel file

Thanks.

First, sheet.iter_rows gives you rows, not columns. If you don't care about the column names as given in the sheet, you may want to pass in min_row=2.

for row in sheet.iter_rows(min_row=2, min_col=1, max_row=5, max_col=8):
    print([cell.value for cell in row])

['John', 'BS', 29, 0.2, 78, 68, 57, 98]
['Doe', 'MS', 35, 0.7, 80, 53, 89, 124]
['Jessica', 'MS', 26, 0.4, 36, 87, 64, 73]
['Alley', 'BS', 33, 0.9, 44, 130, 77, 82]

Then you can do your aggregations:

import json

name_ratings = {}
programs = {}  # fill all this in
...
for row in sheet.iter_rows(min_row=2, min_col=1, max_row=5, max_col=8):
    # column 0 holds the name, column 3 holds the rating
    name_ratings[row[0].value] = row[3].value
    ...

print(name_ratings)
# {'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}
...
result = json.dumps({"name_ratings": name_ratings, "programs": programs})  # and the other values
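
To make the aggregation step concrete, here is a minimal sketch that fills in all three parts of the target structure. It is one possible way to do it, not the only one. The helper name build_data and the LOCATION_NAMES list are hypothetical, and the column order is assumed from the rows printed above. The function takes plain tuples so that it runs without a workbook; with openpyxl, sheet.iter_rows(min_row=2, values_only=True) should yield tuples of this shape.

```python
import json

# Assumed column order, taken from the printed rows:
# Name, Program, Age, Rating, New York, Chicago, Houston, Alabama
LOCATION_NAMES = ["New York", "Chicago", "Houston", "Alabama"]


def build_data(rows):
    """Aggregate row tuples into the nested dictionary from the question."""
    programs = {}
    locations = {name: {} for name in LOCATION_NAMES}
    name_ratings = {}
    # Number the people from 1, as in the desired output.
    for i, row in enumerate(rows, start=1):
        name, program, age, rating, *counts = row
        programs[i] = {"Name": name, "Program": program, "Age": age}
        name_ratings[name] = rating
        for location, count in zip(LOCATION_NAMES, counts):
            locations[location][i] = count
    return {"Program": programs, "Locations": locations, "name_ratings": name_ratings}


# Sample rows copied from the output above; with a real workbook these
# would come from sheet.iter_rows(min_row=2, values_only=True).
rows = [
    ("John", "BS", 29, 0.2, 78, 68, 57, 98),
    ("Doe", "MS", 35, 0.7, 80, 53, 89, 124),
    ("Jessica", "MS", 26, 0.4, 36, 87, 64, 73),
    ("Alley", "BS", 33, 0.9, 44, 130, 77, 82),
]

data = build_data(rows)
result = json.dumps(data)  # JSON object keys are always strings
```

Note that the desired output in the question uses integer keys (1, 2, ...). That works for a Python dictionary, but json.dumps writes them as the strings "1", "2", ... because JSON only allows string keys. The sketch also includes all four people under 'Program', whereas the question lists only two; whether that was intentional is not clear from the post.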

For easier access, you could also use a different library such as pandas to load the data:

import pandas
pandas.read_excel(path)

      Name Program  Age  Rating  New York  Chicago  Houston  Alabama
0     John      BS   29     0.2        78       68       57       98
1      Doe      MS   35     0.7        80       53       89      124
2  Jessica      MS   26     0.4        36       87       64       73
3    Alley      BS   33     0.9        44      130       77       82

It's usually more efficient if you can avoid iterating by rows.

To do that, reshape the dataframe to match the structure of the target JSON, then use zip or .to_json() to build the final dictionary/JSON.

import pandas as pd
import json

df = pd.read_excel('D:/test/sample.xlsx')

program_cols = ['Name','Program','Age']
rating_cols = ['Name','Rating']
location_cols = [col for col in df.columns if col not in program_cols + rating_cols ]


programs = json.loads(df[program_cols].to_json(orient='index'))
locations = json.loads(df[location_cols].T.to_json(orient='index'))
name_ratings = dict (zip(df['Name'], df['Rating']))

data = {'Program': programs,
        'Locations': locations,
        'name_ratings': name_ratings}

Output:

print(data)
{'Program': {'0': {'Name': 'John', 'Program': 'BS', 'Age': 29.0}, '1': {'Name': 'Doe', 'Program': 'MS', 'Age': 35.0}, '2': {'Name': 'Jessica', 'Program': 'MS', 'Age': 26.0}, '3': {'Name': 'Alley', 'Program': 'BS', 'Age': 33.0}}, 'Locations': {'New York': {'0': 78.0, '1': 80.0, '2': 36.0, '3': 44.0}, 'Chicago': {'0': 68.0, '1': 53.0, '2': 87.0, '3': 130.0}, ' Houston': {'0': 57.0, '1': 89.0, '2': 64.0, '3': 77.0}, 'Alabama': {'0': 98.0, '1': 124.0, '2': 73.0, '3': 82.0}}, 'name_ratings': {'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}}
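
The output above differs from the layout in the question in two ways: the keys are zero-based strings ('0', '1', ...) because the data makes a round trip through JSON, and one column name carries a leading space (' Houston'). If a Python dictionary with 1-based integer keys is what is wanted, a variant along the following lines may be closer. It is a sketch under assumptions: the dataframe is built inline here so the example runs without the Excel file, and stripping the column names is a guess at the cause of the stray space.

```python
import pandas as pd

# Inline stand-in for pd.read_excel(...), using the values shown earlier.
df = pd.DataFrame({
    "Name": ["John", "Doe", "Jessica", "Alley"],
    "Program": ["BS", "MS", "MS", "BS"],
    "Age": [29, 35, 26, 33],
    "Rating": [0.2, 0.7, 0.4, 0.9],
    "New York": [78, 80, 36, 44],
    "Chicago": [68, 53, 87, 130],
    "Houston": [57, 89, 64, 77],
    "Alabama": [98, 124, 73, 82],
})

# Remove stray whitespace from the headers (e.g. ' Houston').
df.columns = df.columns.str.strip()
# Use a 1-based index so the dictionary keys start at 1.
df.index = range(1, len(df) + 1)

program_cols = ["Name", "Program", "Age"]
rating_cols = ["Name", "Rating"]
location_cols = [col for col in df.columns if col not in program_cols + rating_cols]

data = {
    # orient="index" gives {row_label: {column: value}}
    "Program": df[program_cols].to_dict(orient="index"),
    # the default orient gives {column: {row_label: value}}
    "Locations": df[location_cols].to_dict(),
    "name_ratings": dict(zip(df["Name"], df["Rating"])),
}
```

Because to_dict() skips the JSON round trip, the row labels stay integers. They would still become strings if data is later passed to json.dumps, since JSON object keys must be strings.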
