Flatten nested JSON into pandas dataframe columns
I have a pandas column that holds nested JSON data as strings. I'd like to flatten the data into multiple pandas columns.
Here's the data from a single cell:
rent['ques'][9] = "{'Rent': [{'Name': 'Asking', 'Value': 16.07, 'Unit': 'Usd'}], 'Vacancy': {'Name': 'Vacancy', 'Value': 25.34100001, 'Unit': 'Pct'}}"
For each cell in the pandas column, I'd like to parse this string and create multiple columns. The expected output looks something like this:
When I run json_normalize(rent['ques']), I receive the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-28-cebc86357f34> in <module>()
----> 1 json_normalize(rentoff['Survey'])
/anaconda3/lib/python3.7/site-packages/pandas/io/json/normalize.py in json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep)
196 if record_path is None:
197 if any([[isinstance(x, dict)
--> 198 for x in compat.itervalues(y)] for y in data]):
199 # naive normalization, this is idempotent for flat records
200 # and potentially will inflate the data considerably for
/anaconda3/lib/python3.7/site-packages/pandas/io/json/normalize.py in <listcomp>(.0)
196 if record_path is None:
197 if any([[isinstance(x, dict)
--> 198 for x in compat.itervalues(y)] for y in data]):
199 # naive normalization, this is idempotent for flat records
200 # and potentially will inflate the data considerably for
/anaconda3/lib/python3.7/site-packages/pandas/compat/__init__.py in itervalues(obj, **kw)
210
211 def itervalues(obj, **kw):
--> 212 return iter(obj.values(**kw))
213
214 next = next
AttributeError: 'str' object has no attribute 'values'
Try this:
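import json
import pandas as pd

# Swap single quotes for double quotes so each string is valid JSON
df['quest'] = df['quest'].str.replace("'", '"')
dfs = []
for i in df['quest']:
    data = json.loads(i)
    # One row per 'Rent' entry; the 'Vacancy' fields are carried along as meta columns
    dfx = pd.json_normalize(data, record_path=['Rent'], meta=[['Vacancy', 'Name'], ['Vacancy', 'Unit'], ['Vacancy', 'Value']])
    dfs.append(dfx)
df = pd.concat(dfs).reset_index(drop=True)
print(df)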
     Name  Value Unit Vacancy.Name Vacancy.Unit  Vacancy.Value
0  Asking  16.07  Usd      Vacancy          Pct         25.341
1  Asking  16.07  Usd      Vacancy          Pct         25.341
2  Asking  16.07  Usd      Vacancy          Pct         25.341
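A possible variation, offered as a sketch and not as part of the original answer: the cell shown in the question is a Python-literal string (single-quoted keys), so it can be parsed with ast.literal_eval instead of swapping quote characters, which would break if any value contained an apostrophe. The sample frame below is made up for illustration; it reuses the column name 'ques' from the question and assumes every cell has the same 'Rent' / 'Vacancy' layout and that pandas 1.0 or later is installed (for pd.json_normalize).

```python
import ast

import pandas as pd

# Hypothetical sample data: the single cell from the question, repeated three times.
cell = (
    "{'Rent': [{'Name': 'Asking', 'Value': 16.07, 'Unit': 'Usd'}], "
    "'Vacancy': {'Name': 'Vacancy', 'Value': 25.34100001, 'Unit': 'Pct'}}"
)
rent = pd.DataFrame({"ques": [cell, cell, cell]})

# Turn each Python-literal string into a real dict (no quote replacement needed).
records = rent["ques"].apply(ast.literal_eval).tolist()

# Normalize the whole list at once: one row per entry in 'Rent',
# with the nested 'Vacancy' fields carried along as meta columns.
flat = pd.json_normalize(
    records,
    record_path=["Rent"],
    meta=[["Vacancy", "Name"], ["Vacancy", "Unit"], ["Vacancy", "Value"]],
)
print(flat)
```

Passing the full list of dicts to pd.json_normalize avoids the per-row loop and the final pd.concat. If some cells are missing or malformed, they would need to be filtered out or handled before parsing.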