Parse Nested JSON in Python to Remove Special Characters in Columns
This is my JSON file:
{
  "highest_table": {
    "items": [
      {
        "key": "Human 1",
        "columns": {
          "Na$me": "Tom",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "24",
          "Ge_nder": "M"
        }
      },
      {
        "key": "Human 2",
        "columns": {
          "Na$me": "John",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "23",
          "Ge_nder": "M"
        }
      }
    ]
  }
}
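The goal is to remove all special characters from the column names (or, if it is easier, all special characters in the .json file) and return a .json file. My initial idea was to load it into pandas, strip the special characters from the column headers, and then convert it back to a .json file.
This is what I have tried so far. Both attempts print only a single row.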
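import json
import pandas as pd

data_file = r"C:\characters.json"

# Attempt 1: parse the file, then flatten it with json_normalize
# (pd.json_normalize needs pandas >= 1.0; older versions import it from pandas.io.json)
with open(data_file) as f:
    data = json.load(f)
df = pd.json_normalize(data)

# Attempt 2: let pandas read the file directly
df = pd.read_json(data_file)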
How can I extract the columns, remove the special characters, and put them back into a .json file?
There is one small catch with fixkey: you need to supply a complete implementation of fixkey yourself, but this should solve your problem.
import json

def fixkey(key):
    # toy implementation: strips only "&" and "$"
    #print("fixing {}".format(key))
    return key.replace("&", "").replace("$", "")

def normalize(data):
    # Recursively rebuild dicts and lists, cleaning every dict key on the way.
    #print("normalizing {}".format(data))
    if isinstance(data, dict):
        data = {fixkey(key): normalize(value) for key, value in data.items()}
    elif isinstance(data, list):
        data = [normalize(item) for item in data]
    return data
jsdata = """
{
  "highest_table": {
    "items": [
      {
        "key": "Human 1",
        "columns": {
          "Na$me": "Tom",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "24",
          "Ge_nder": "M"
        }
      },
      {
        "key": "Human 2",
        "columns": {
          "Na$me": "John",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "23",
          "Ge_nder": "M"
        }
      }
    ]
  }
}
"""
data = json.loads(jsdata)
data = normalize(data)
result = json.dumps(data, indent=2)
print(result)
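For reference, a fuller fixkey might use a regular expression instead of chained replace calls. The sketch below assumes that a "special character" is anything other than an ASCII letter, digit, or underscore; that definition is a guess, not something the question specifies. Underscores are kept so that structural keys such as highest_table survive; drop the underscore from the pattern if Ge_nder should become Gender. The helper clean_json_file and its path arguments are hypothetical names used only for illustration.

```python
import json
import re

# Assumption: "special" means anything other than an ASCII letter, digit, or underscore.
_SPECIAL = re.compile(r"[^A-Za-z0-9_]")

def fixkey(key):
    """Strip every character matched by _SPECIAL from a key."""
    return _SPECIAL.sub("", key)

def normalize(data):
    """Recursively rebuild dicts and lists with cleaned keys; values are left untouched."""
    if isinstance(data, dict):
        return {fixkey(key): normalize(value) for key, value in data.items()}
    if isinstance(data, list):
        return [normalize(item) for item in data]
    return data

def clean_json_file(src_path, dst_path):
    """Read JSON from src_path, clean its keys, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(normalize(data), dst, indent=2)
```

Something like clean_json_file(r"C:\characters.json", "cleaned.json") would then write a cleaned copy, where cleaned.json is an arbitrary output name. Be aware that two different keys can collapse to the same cleaned name, in which case the later one silently overwrites the earlier one.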
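Frankly, this is ugly, but I haven't found a more general way. It is very specific to your particular JSON (the problem really ought to be fixed in the API).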
import json
response = """{
  "highest_table": {
    "items": [
      {
        "key": "Human 1",
        "columns": {
          "Na$me": "Tom",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "24",
          "Ge_nder": "M"
        }
      },
      {
        "key": "Human 2",
        "columns": {
          "Na$me": "John",
          "Description(ms/2)": "Table Number One on the Top",
          "A&ge": "23",
          "Ge_nder": "M"
        }
      }
    ]
  }
}"""

def fix_json(resp):
    # Rebuild the structure item by item, renaming each column by hand.
    output = {'highest_table': {'items': []}}
    for item in resp['highest_table']['items']:
        inner_dict = item['columns']
        fixed_values = {'Name': inner_dict['Na$me'],
                        'Description(ms/2)': inner_dict['Description(ms/2)'],
                        'Age': inner_dict['A&ge'],
                        'Gender': inner_dict['Ge_nder']
                        }
        new_inner = {'key': item['key'], 'columns': fixed_values}
        output['highest_table']['items'].append(new_inner)
    return output

response = json.loads(response)
fixed = fix_json(response)
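The hard-coded dictionary in fix_json can be made a little less repetitive with a rename table. This is only a sketch of the same idea, still tied to the highest_table / items / columns layout shown in the question. RENAMES and fix_columns are hypothetical names, and any column missing from the table is passed through unchanged.

```python
import json

# Hypothetical rename table: original column name -> cleaned name.
RENAMES = {"Na$me": "Name", "A&ge": "Age", "Ge_nder": "Gender"}

def fix_columns(resp, renames=RENAMES):
    """Return a new structure in which each item's column names are renamed."""
    items = []
    for item in resp["highest_table"]["items"]:
        # Look each column name up in the table; fall back to the original name.
        columns = {renames.get(name, name): value
                   for name, value in item["columns"].items()}
        items.append({"key": item["key"], "columns": columns})
    return {"highest_table": {"items": items}}

# One-item sample in the same shape as the JSON from the question.
sample = {
    "highest_table": {
        "items": [
            {
                "key": "Human 1",
                "columns": {
                    "Na$me": "Tom",
                    "Description(ms/2)": "Table Number One on the Top",
                    "A&ge": "24",
                    "Ge_nder": "M",
                },
            }
        ]
    }
}

fixed_sample = fix_columns(sample)
print(json.dumps(fixed_sample, indent=2))
```

To get a .json file back, the result can be passed to json.dump with an open file handle instead of being printed.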