Python: Convert excel table to a specific JSON format
I have an Excel table in the following format:
I want to convert this table to a JSON format which looks like this:
data = {'Program': {1: {'Name': 'John', 'Program': 'BS', 'Age': 29}, 2: {'Name': 'Doe', 'Program': 'MS', 'Age': 35}}, 'Locations': {'New York': {1: 78, 2: 80, 3: 36, 4: 44}, 'Chicago': {1: 68, 2: 53, 3: 87, 4: 130}, 'Houston': {1: 57, 2: 89, 3: 64, 4: 77}, 'Alabama': {1: 98, 2: 124, 3: 73, 4: 82}}, 'name_ratings': {'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}}
I am using openpyxl to load the Excel file in Python and iterating over the rows.
for col in sheet.iter_rows(min_row=1, min_col=1, max_row=5, max_col=8):
    for cell in col:
        print(cell.value)
Can anyone please help me with this?
File attached: sample excel file
Thanks.
First, sheet.iter_rows gives you rows, not columns. If you don't care about the column names as given in the sheet, you may want to pass in min_row=2.
for row in sheet.iter_rows(min_row=2, min_col=1, max_row=5, max_col=8):
    print([cell.value for cell in row])
['John', 'BS', 29, 0.2, 78, 68, 57, 98]
['Doe', 'MS', 35, 0.7, 80, 53, 89, 124]
['Jessica', 'MS', 26, 0.4, 36, 87, 64, 73]
['Alley', 'BS', 33, 0.9, 44, 130, 77, 82]
Then you can do your aggregations:
import json

name_ratings = {}
programs = {}  # fill all this in
...
for row in sheet.iter_rows(min_row=2, min_col=1, max_row=5, max_col=8):
    name_ratings[row[0].value] = row[3].value
...
{'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}
...
result = json.dumps({"name_ratings": name_ratings, "programs": programs}) # and the other values
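The elided parts of the loop above could be filled in roughly as follows. This is a minimal sketch rather than the answerer's own code: it assumes the rows have already been read into plain lists of values in the column order printed earlier (Name, Program, Age, Rating, then one column per location). The helper name build_data and the hard-coded LOCATIONS list are illustrative assumptions. Rows are numbered from 1, as in the requested format.

```python
import json

# Assumed column layout, taken from the sample rows printed above.
LOCATIONS = ["New York", "Chicago", "Houston", "Alabama"]


def build_data(rows):
    """Build the nested dict from rows of plain cell values (hypothetical helper)."""
    programs = {}
    locations = {name: {} for name in LOCATIONS}
    name_ratings = {}
    # Number rows from 1 to match the keys in the requested format.
    for number, row in enumerate(rows, start=1):
        name, program, age, rating = row[:4]
        programs[number] = {"Name": name, "Program": program, "Age": age}
        name_ratings[name] = rating
        # The remaining cells hold one value per location, in sheet order.
        for location, value in zip(LOCATIONS, row[4:]):
            locations[location][number] = value
    return {"Program": programs, "Locations": locations, "name_ratings": name_ratings}


rows = [
    ["John", "BS", 29, 0.2, 78, 68, 57, 98],
    ["Doe", "MS", 35, 0.7, 80, 53, 89, 124],
    ["Jessica", "MS", 26, 0.4, 36, 87, 64, 73],
    ["Alley", "BS", 33, 0.9, 44, 130, 77, 82],
]
data = build_data(rows)
# JSON object keys are always strings, so json.dumps writes the integer keys as "1", "2", ...
result = json.dumps(data)
```

With openpyxl, rows could be built as [[cell.value for cell in row] for row in sheet.iter_rows(min_row=2, min_col=1, max_row=5, max_col=8)].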
For easier access, you could also use a different library, such as pandas, to load the data:
import pandas
pandas.read_excel(path)
      Name Program  Age  Rating  New York  Chicago  Houston  Alabama
0     John      BS   29     0.2        78       68       57       98
1      Doe      MS   35     0.7        80       53       89      124
2  Jessica      MS   26     0.4        36       87       64       73
3    Alley      BS   33     0.9        44      130       77       82
It's usually more efficient if you can avoid iterating by rows.
For this, you can reshape the DataFrame to match the structure of the JSON, then use zip or .to_json() to construct your final dictionary/JSON.
import pandas as pd
import json
df = pd.read_excel('D:/test/sample.xlsx')
program_cols = ['Name','Program','Age']
rating_cols = ['Name','Rating']
location_cols = [col for col in df.columns if col not in program_cols + rating_cols]
programs = json.loads(df[program_cols].to_json(orient='index'))
locations = json.loads(df[location_cols].T.to_json(orient='index'))
name_ratings = dict(zip(df['Name'], df['Rating']))
data = {'Program': programs,
        'Locations': locations,
        'name_ratings': name_ratings}
Output:
print(data)
{'Program': {'0': {'Name': 'John', 'Program': 'BS', 'Age': 29.0}, '1': {'Name': 'Doe', 'Program': 'MS', 'Age': 35.0}, '2': {'Name': 'Jessica', 'Program': 'MS', 'Age': 26.0}, '3': {'Name': 'Alley', 'Program': 'BS', 'Age': 33.0}}, 'Locations': {'New York': {'0': 78.0, '1': 80.0, '2': 36.0, '3': 44.0}, 'Chicago': {'0': 68.0, '1': 53.0, '2': 87.0, '3': 130.0}, ' Houston': {'0': 57.0, '1': 89.0, '2': 64.0, '3': 77.0}, 'Alabama': {'0': 98.0, '1': 124.0, '2': 73.0, '3': 82.0}}, 'name_ratings': {'John': 0.2, 'Doe': 0.7, 'Jessica': 0.4, 'Alley': 0.9}}
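Note that the printed dictionary has string keys starting at '0', because it was round-tripped through JSON from the DataFrame's default index. The format requested in the question uses integer keys starting at 1. One possible way to bridge that gap is sketched below. The renumber helper is a hypothetical name, and the data literal is a shortened stand-in for the dictionary printed above.

```python
def renumber(mapping):
    """Turn {'0': v0, '1': v1, ...} into {1: v0, 2: v1, ...} (hypothetical helper)."""
    return {int(key) + 1: value for key, value in mapping.items()}


# A shortened stand-in for the `data` dictionary printed above.
data = {
    "Program": {
        "0": {"Name": "John", "Program": "BS", "Age": 29.0},
        "1": {"Name": "Doe", "Program": "MS", "Age": 35.0},
    },
    "Locations": {
        "New York": {"0": 78.0, "1": 80.0},
        "Chicago": {"0": 68.0, "1": 53.0},
    },
    "name_ratings": {"John": 0.2, "Doe": 0.7},
}

converted = {
    # Row records: re-key by 1-based integer row number.
    "Program": renumber(data["Program"]),
    # Each location maps row number to value, so re-key the inner dicts.
    "Locations": {city: renumber(values) for city, values in data["Locations"].items()},
    # Already keyed by name, so left unchanged.
    "name_ratings": data["name_ratings"],
}
```

The values are left untouched here, so floats such as 29.0 stay floats.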