
Convert JSON file into proper format using python pandas

I want to convert a JSON file into a proper format. I have a JSON file that looks like this:

{
    "fruit": "Apple",
    "size": "Large",
    "color": "Red",
    "details":"|seedless:true|,|condition:New|"

},
{
    "fruit": "Almond",
    "size": "small",
    "color": "brown",
    "details":"|Type:dry|,|seedless:true|,|condition:New|"

}

As you can see, the data inside details can vary from record to record.

I want to change it to:

{
    "fruit": "Apple",
    "size": "Large",
    "color": "Red",
    "seedless":"true",
    "condition":"New"

},
{
    "fruit": "Almond",
    "size": "small",
    "color": "brown",
    "Type":"dry",
    "seedless":"true",
    "condition":"New"

}

I have tried doing this in Python using pandas:

import json
import pandas as pd
import re
df = pd.read_json("data.json",lines=True)

#I tried to change the pattern of data in details column as

re1 = re.compile('r/|(.?):(.?)|/')
re2 = re.compile('r\"(.*?)\":\"(.*?)\"')

df.replace({'details' :re1}, {'details' : re2},inplace = True, regex = True);

However, the output shows up as "object" in every row of the details column.

Try this,

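# data is the list of dictionaries loaded from the JSON file
for d in data:
    # remove the packed string, e.g. "|seedless:true|,|condition:New|"
    details = d.pop('details')
    # keep only the "key:value" pieces; split on the first colon only
    d.update(dict(x.split(":", 1) for x in details.split("|") if ":" in x))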

print(data)

[{'color': 'Red',
  'condition': 'New',
  'fruit': 'Apple',
  'seedless': 'true',
  'size': 'Large'},
 {'Type': 'dry',
  'color': 'brown',
  'condition': 'New',
  'fruit': 'Almond',
  'seedless': 'true',
  'size': 'small'}]
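
For a self-contained version of the loop above, the following is a minimal sketch rather than a definitive implementation. The helper name `flatten_details` is hypothetical, and the code assumes every `details` string follows the `|key:value|,|key:value|` pattern shown in the question. Unlike the loop, it copies each record instead of modifying it in place.

```python
import json


def flatten_details(records):
    """Return new dicts with the packed 'details' string expanded into keys.

    Hypothetical helper: assumes details looks like "|key:value|,|key:value|".
    """
    flattened = []
    for record in records:
        row = dict(record)  # shallow copy so the input is left untouched
        details = row.pop("details", "")
        for part in details.split("|"):
            if ":" in part:
                # split on the first colon only, so a value may contain ':'
                key, value = part.split(":", 1)
                row[key] = value
        flattened.append(row)
    return flattened


data = [
    {"fruit": "Apple", "size": "Large", "color": "Red",
     "details": "|seedless:true|,|condition:New|"},
    {"fruit": "Almond", "size": "small", "color": "brown",
     "details": "|Type:dry|,|seedless:true|,|condition:New|"},
]

result = flatten_details(data)
print(json.dumps(result, indent=4))
```

The pieces between pipes that contain no colon (the empty strings and the bare commas) are skipped by the `":" in part` check, which is why no extra cleanup of the separators is needed.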

You can convert the (list of) dictionaries into a pandas DataFrame.

import pandas as pd

# data is a list of dictionaries
data = [{
    "fruit": "Apple",
    "size": "Large",
    "color": "Red",
    "details":"|seedless:true|,|condition:New|"

},
{
    "fruit": "Almond",
    "size": "small",
    "color": "brown",
    "details":"|Type:dry|,|seedless:true|,|condition:New|"

}]

# convert to data frame
df = pd.DataFrame(data)

# remove '|' from details and convert to list
# regex=False treats '|' as a literal character (the default changed in pandas 2.0)
df['details'] = df['details'].str.replace('|', '', regex=False).str.split(',')

# explode list => one row for each element
df = df.explode('details')

# split details into name/value pair
df[['name', 'value']] = df['details'].str.split(':', n=1, expand=True)

# drop details column
df = df.drop(columns='details')

print(df)

    fruit   size  color       name value
0   Apple  Large    Red   seedless  true
0   Apple  Large    Red  condition   New
1  Almond  small  brown       Type   dry
1  Almond  small  brown   seedless  true
1  Almond  small  brown  condition   New
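
The frame above is in long form (one row per name/value pair), while the question asks for one record per fruit. As a hedged sketch, the long form can be pivoted back to wide form. The names `long_df`, `wide`, `records` and the temporary `row` column are illustrative, and passing a list to `index=` in `DataFrame.pivot` assumes pandas 1.1 or later.

```python
import pandas as pd

# Long-form frame equal to the printed output above, rebuilt here
# so that the example runs on its own.
long_df = pd.DataFrame(
    {
        "fruit": ["Apple", "Apple", "Almond", "Almond", "Almond"],
        "size": ["Large", "Large", "small", "small", "small"],
        "color": ["Red", "Red", "brown", "brown", "brown"],
        "name": ["seedless", "condition", "Type", "seedless", "condition"],
        "value": ["true", "New", "dry", "true", "New"],
    },
    index=[0, 0, 1, 1, 1],
)

# Keep the original row number in the pivot index so that the record
# order is preserved and identical fruits do not collide.
wide = (
    long_df.rename_axis("row")
    .reset_index()
    .pivot(index=["row", "fruit", "size", "color"], columns="name", values="value")
    .reset_index()
    .drop(columns="row")
)
wide.columns.name = None

# Drop the NaN cells per row so each record only carries its own keys.
records = [row.dropna().to_dict() for _, row in wide.iterrows()]
print(records)
```

Keys that a record does not have (such as `Type` for the Apple row) appear as NaN in `wide`, so they are dropped per row before converting back to dictionaries.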

