
Cleaning and Extracting JSON From Pandas columns

I have a DataFrame with the following structure:

id    year  name        homepage        
238   2022  Adventure  {'keywords': 'en', 'genres':[{"revenue": 1463, "name": "culture clash"}], 'runtime': 150, 'vote_average': 7}

But what I need is this structure:

   id    year   name        keywords    revenue     name               runtime vote_average
   238   2022  Adventure     en         1463        culture clash      150     7

How can I do this?

The idea is to run json_normalize on the "homepage" column and join the result back to df. You can pass meta and the record path directly into json_normalize as parameters:

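import pandas as pd
# Flatten each 'genres' record, carry the top-level fields along as meta,
# then join back to df; the clashing 'name' column gets the '_genres' suffix.
out = (df.join(pd.json_normalize(df['homepage'], record_path='genres',
                                 meta=['keywords', 'runtime', 'vote_average']),
               lsuffix='', rsuffix='_genres')
       .drop(columns='homepage'))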

Output:

    id  year       name keywords  revenue    name_genres runtime vote_average
0  238  2022  Adventure       en     1463  culture clash     150            7
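
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It makes a few assumptions that are not stated in the question: the sample DataFrame is rebuilt by hand from the row shown above, the "homepage" cells may be stored as strings rather than dicts (hence the hypothetical to_dict helper using ast.literal_eval), and every row has exactly one entry in "genres" so that the positional join lines up.

```python
import ast

import pandas as pd

# Hypothetical sample data rebuilt from the question; here 'homepage'
# is assumed to be stored as a string, which is common after reading a CSV.
df = pd.DataFrame({
    "id": [238],
    "year": [2022],
    "name": ["Adventure"],
    "homepage": [
        '''{'keywords': 'en', 'genres': [{"revenue": 1463, "name": "culture clash"}], 'runtime': 150, 'vote_average': 7}'''
    ],
})


def to_dict(value):
    """Hypothetical helper: parse a Python-literal string, pass dicts through."""
    return ast.literal_eval(value) if isinstance(value, str) else value


# Clean the column first so json_normalize receives real dicts.
parsed = df["homepage"].map(to_dict)

# One output row per 'genres' record; top-level fields are carried as meta.
genres = pd.json_normalize(
    parsed.tolist(),
    record_path="genres",
    meta=["keywords", "runtime", "vote_average"],
)

# Join on the default integer index. This only lines up if each row has
# exactly one 'genres' entry (an assumption of this sketch).
out = df.drop(columns="homepage").join(genres, rsuffix="_genres")
print(out)
```

In this sketch the record fields (revenue, name_genres) land before the meta fields; if a specific column layout is needed, reorder afterwards by selecting the columns in the desired order, for example out[[...]] with the names listed explicitly.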

