Extracting data from a pandas JSON file

I have the following JSON data file, which I have converted to a pandas DataFrame. The columns are as follows:

Index(['id', 'title', 'abstract', 'content', 'metadata'], dtype='object')

I am particularly interested in the 'metadata' column. An element of that column looks like this:

df_json.loc[78, 'metadata']
"{'classification': {'name': 'Manufacturing, Transport & Logistics'}, 'subClassification': {'name': 'Warehousing, Storage & Distribution'}, 'area': {'name': 'Southern Suburbs & Logan'}, 'location': {'name': 'Brisbane'}, 'suburb': {'name': 'Milton'}, 'workType': {'name': 'Casual/Vacation'}}"

So I want to create new columns by extracting information from the 'metadata' column, for example the location. I am not sure how to extract it and place it alongside the existing data as added columns such as location, etc.

    id  title   abstract    content metadata    clean_content
0   38915469    Recruitment Consultant  We are looking for someone to focus purely on ...   <HTML><p>Are you looking to join a thriving bu...   {'standout': {'bullet1': 'Join a Sector that i...   Are you looking to join a thriving business th...
1   38934839    Computers Salesperson - Coburg  Passionate about exceptional customer service?...   <HTML><p>&middot;&nbsp;&nbsp;Casual hours as r...   {'additionalSalaryText': 'Attractive Commissio...   middotnbspnbspCasual hours as required transit...
2   38946054    Senior Developer | SA   Readifarians are known for discovering the lat...   <HTML><p>Readify helps organizations 

You can use pandas.json_normalize.

Applying it to your string:

 pd.json_normalize(eval(json_string)) 
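As a minimal sketch of the single-string case, assuming the 'metadata' value is a Python-literal string (single-quoted, so not strictly valid JSON) like the one in the question. Note that `ast.literal_eval` is swapped in for `eval` here, since it only parses literals and does not execute arbitrary code. The sample value is shortened from the question for illustration.

```python
import ast
import pandas as pd

# Shortened sample of one 'metadata' value from the question.
json_string = (
    "{'classification': {'name': 'Manufacturing, Transport & Logistics'}, "
    "'location': {'name': 'Brisbane'}, 'suburb': {'name': 'Milton'}}"
)

# literal_eval turns the string into a dict without executing code.
record = ast.literal_eval(json_string)

# json_normalize flattens nested keys into dotted column names.
flat = pd.json_normalize(record)
print(flat.columns.tolist())
# ['classification.name', 'location.name', 'suburb.name']
```

The result is a one-row DataFrame, so a value such as the location can be read with `flat.loc[0, 'location.name']`.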

If this works for you, you can then simply try:

 df["metadata"].apply(lambda x: pd.json_normalize(eval(x)))
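The `apply` call above yields a Series whose elements are one-row DataFrames, so one more step is needed to get the columns beside the original data. One possible sketch, assuming every 'metadata' value is a Python-literal string: parse each row, normalize the whole list of dicts in a single call, and join the result back on the index. The toy frame and the name `meta_cols` are illustrative only and are not part of the original answer.

```python
import ast
import pandas as pd

# Toy stand-in for df_json; only two columns are shown.
df_json = pd.DataFrame({
    "id": [1, 2],
    "metadata": [
        "{'location': {'name': 'Brisbane'}, 'suburb': {'name': 'Milton'}}",
        "{'location': {'name': 'Melbourne'}}",
    ],
})

# Parse each string into a dict (literal_eval does not execute code).
parsed = df_json["metadata"].apply(ast.literal_eval)

# Flatten all rows at once; keys missing from a row become NaN.
meta_cols = pd.json_normalize(parsed.tolist())
meta_cols.index = df_json.index  # keep row alignment for the join

# Place the new columns beside the original ones.
result = df_json.join(meta_cols)
print(result[["id", "location.name", "suburb.name"]])
```

Because the rows in the question carry different metadata keys, the joined frame will likely contain NaN wherever a row lacks a given key.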
