
How to write these nested JSON lists into separate columns in Python pandas

I have this JSON file:

{"a": [{"Name": "name1",
"number": "number1",
"defaultPrice": {"p": "232", "currency": "CAD"},
"prices": {"DZ": {"p": "62", "currency": "RMB"},
 "AU": {"p": "73", "currency": "AUD"},
"lg": "en"}},
{"Name": "name2",
"number": "number2",
 "defaultPrice": {"p": "233", "currency": "CAD"},
 "prices": {"DZ": {"p": "63", "currency": "RMB"},
 "US": {"p": "72", "currency": "USD"},
 "Lg": "en"}}]}

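Now I get a table with the columns Name, number, defaultPrice and prices, but each cell of the prices column looks like three rows, and I need to read the price (for example 63) from the key p of an entry such as "p": "63", "currency": "RMB".

What I want is a table with the price and the currency in separate columns. I tried the following: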

ndf = pd.concat([pd.Series(x) for x in prices], axis=1)

But this only gives a wrong result:

 0                                                  1
 DZ           {"p": "232", "currency": "CAD"}  {"p": "62", "currency": "RMB"}
 AU           {"p": "233", "currency": "CAD"}    {"p": "63","currency":"RMB"}

Is there any way to correct this so that I get the expected output?

Name    Number   Code  currency
name1   number1   AU    AUD      
name1   number1   DZ    RMB      

Thanks a lot!!

You can use apply(pd.Series) on the defaultPrice column to split it into separate columns, then concatenate the result back onto the original DataFrame.

import pandas as pd

prices = {"a": [{"Name": "name1",
"number": "number1",
"defaultPrice": {"p": "232", "currency": "CAD"},
"prices": {"DZ": {"p": "62", "currency": "RMB"},
 "AU": {"p": "73", "currency": "AUD"},
"lg": "en"}},
{"Name": "name2",
"number": "number2",
 "defaultPrice": {"p": "233", "currency": "CAD"},
 "prices": {"DZ": {"p": "63", "currency": "RMB"},
 "US": {"p": "72", "currency": "USD"},
 "Lg": "en"}}]}

ndf = pd.DataFrame(prices['a'])
pd.concat([ndf, ndf['defaultPrice'].apply(pd.Series)], axis=1).drop('defaultPrice', axis=1)

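However, your prices column still holds nested dictionaries. Since you did not say how you want to handle it, I left it as is (it is not shown in the output).

Output: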

Name    number  p   currency
name1   number1 232 CAD
name2   number2 233 CAD
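
As an alternative to apply(pd.Series), the same split can probably be done with pd.json_normalize, which expands nested dictionaries into dotted column names. The following is only a minimal sketch: it assumes pandas 1.0 or later (where json_normalize is available at the top level), and the records list is a trimmed copy of the sample data with the prices key left out for brevity.

```python
import pandas as pd

# Trimmed copy of the sample data (the "prices" key is omitted here on purpose)
records = [
    {"Name": "name1", "number": "number1",
     "defaultPrice": {"p": "232", "currency": "CAD"}},
    {"Name": "name2", "number": "number2",
     "defaultPrice": {"p": "233", "currency": "CAD"}},
]

# Nested dicts become columns named "defaultPrice.p" and "defaultPrice.currency"
flat = pd.json_normalize(records)

# Shorten the dotted names to match the output shown above
flat = flat.rename(columns={"defaultPrice.p": "p",
                            "defaultPrice.currency": "currency"})
print(flat)
```

If the prices key were kept in the records, it would be expanded in the same way into columns such as prices.DZ.p, so this approach yields a wide table rather than one row per country.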

The JSON data:

j = {"a": [{ "Name": "name1",
             "number": "number1",
             "defaultPrice":  {"p": "232", "currency": "CAD"},
             "prices": {"DZ": {"p": "62", "currency": "RMB"},
                        "AU": {"p": "73", "currency": "AUD"},
                        "lg": "en"
                       }
             },
            {"Name": "name2",
             "number": "number2",
             "defaultPrice":  {"p": "233", "currency": "CAD"},
             "prices": {"DZ": {"p": "63", "currency": "RMB"},
                        "US": {"p": "72", "currency": "USD"},
                        "Lg": "en"
                       }
            }
          ]}

Code to get the desired output:

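# Written against an older pandas release; newer versions expose this as
# pd.json_normalize and may reject a dict-valued record_path.
from pandas.io.json import json_normalize

# Collect every key that appears under "prices" across all records
country_codes = set()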
for d in j['a']:
  c = d['prices'].keys()
  country_codes.update(c)

country_codes = sorted([i for i in country_codes if not i in ['lg','Lg']])
country_codes

meta = ['Name','number'] + [['prices',c,'p'] for c in country_codes] + [['prices',c,'currency'] for c in country_codes] 

df = json_normalize(j['a'], record_path = 'prices', meta = meta,errors='ignore')
df = df.rename(columns={0: 'countryCode'})
df = df[~df['countryCode'].isin(['lg','Lg'])]

for idx, row in df.iterrows():
    country = row['countryCode']
    col_price = df.columns[df.columns.str.contains(country+'.p')][0]
    col_currency = df.columns[df.columns.str.contains(country+'.currency')][0]
    price = row[col_price]
    currency = row[col_currency]
    df.loc[idx,'price'] = price
    df.loc[idx,'currency'] = currency

df = df[['Name','number','countryCode', 'currency', 'price']]


df

This gives:

    Name   number countryCode currency price
0  name1  number1          DZ      RMB    62
1  name1  number1          AU      AUD    73
3  name2  number2          DZ      RMB    63
4  name2  number2          US      USD    72
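
The same long-format table can also be built without json_normalize, by looping over the records and emitting one row per country code. This is a minimal sketch rather than the answer's method; it assumes that every real price entry is a dictionary with the keys p and currency, and that anything else under prices (such as lg or Lg) can be skipped.

```python
import pandas as pd

j = {"a": [{"Name": "name1",
            "number": "number1",
            "defaultPrice": {"p": "232", "currency": "CAD"},
            "prices": {"DZ": {"p": "62", "currency": "RMB"},
                       "AU": {"p": "73", "currency": "AUD"},
                       "lg": "en"}},
           {"Name": "name2",
            "number": "number2",
            "defaultPrice": {"p": "233", "currency": "CAD"},
            "prices": {"DZ": {"p": "63", "currency": "RMB"},
                       "US": {"p": "72", "currency": "USD"},
                       "Lg": "en"}}]}

rows = []
for item in j["a"]:
    for code, entry in item["prices"].items():
        # "lg"/"Lg" map to a plain string, not a price dict, so skip them
        if not isinstance(entry, dict):
            continue
        rows.append({"Name": item["Name"],
                     "number": item["number"],
                     "countryCode": code,
                     "currency": entry["currency"],
                     "price": entry["p"]})

# Fix the column order explicitly so it matches the table above
df = pd.DataFrame(rows, columns=["Name", "number", "countryCode",
                                 "currency", "price"])
print(df)
```

The rows come out in the same order as above, but the index runs from 0 to 3 without gaps because the lg entries are never added in the first place.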
