
Python: how to split the data stored in a DataFrame cell into columns of the original DataFrame?

Original DataFrame (2 columns):

matchNum accumulatedscore
78       [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '0', 'away': '0'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}]
56       [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '4', 'away': '3'}]

How can I expand them into separate columns of the original DataFrame? I would like to get:

matchNum  home1  away1  home2  away2
78        0      0      1      0
56        2      1      4      3

It is so difficult...


The simplest way uses no lambdas, only transformations :). It works because accumulatedscore actually contains JSON values.

Preparation

import pandas as pd
import json

d = {
    "matchNum": [78, 56],
    "accumulatedscore":
        [
            '[{"periodvalue": "FirstHalf", "periodstatus": "ResultFinal", "home": "0", "away": "0"}, {"periodvalue": "SecondHalf", "periodstatus": "ResultFinal", "home": "1", "away": "0"}]',
            '[{"periodvalue": "FirstHalf", "periodstatus": "ResultFinal", "home": "2", "away": "1"}, {"periodvalue": "SecondHalf", "periodstatus": "ResultFinal", "home": "4", "away": "3"}]'
        ]
}
df = pd.DataFrame(d)

Solution

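# Parse each JSON string into a list of dicts, then spread that list
# across columns 0 and 1 (one dict per period).
dfa = (
    df.join(pd.json_normalize(df["accumulatedscore"].apply(json.loads)))
        .rename(columns={0: "dict1", 1: "dict2"})
        .drop("accumulatedscore", axis=1)
)
# Expand each dict into its own columns; rsuffix="2" disambiguates the
# second period's overlapping names (home2, away2).
dfb = (
    dfa.join(pd.json_normalize(dfa["dict1"]))
       .join(pd.json_normalize(dfa["dict2"]), rsuffix="2")
       .rename(columns={"home": "home1", "away": "away1"})[["matchNum", "home1", "away1", "home2", "away2"]]
)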

Result

dfb
   matchNum home1 away1 home2 away2
0        78     0     0     1     0
1        56     2     1     4     3
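
Note that the scores in dfb are still strings, because they are quoted in the JSON. A minimal sketch of converting them to integers, assuming every score parses as a whole number; the small frame below is only a stand-in for dfb, and score_cols is a name introduced here for illustration:

```python
import pandas as pd

# Stand-in for the `dfb` result above: the score columns hold strings.
dfb = pd.DataFrame({
    "matchNum": [78, 56],
    "home1": ["0", "2"],
    "away1": ["0", "1"],
    "home2": ["1", "4"],
    "away2": ["0", "3"],
})

# Hypothetical helper list naming the columns to convert.
score_cols = ["home1", "away1", "home2", "away2"]

# Cast only the score columns; matchNum is already numeric.
dfb = dfb.astype({col: int for col in score_cols})
print(dfb.dtypes)
```

After the cast, the columns can be compared or summed numerically.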

Assuming your pandas DataFrame looks like this:

d = {'matchNum': [78, 56], 
     'accumulatedscore':["[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '0', 'away': '0'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}]",
                        "[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '4', 'away': '3'}]"
                        ]}

import pandas as pd
df = pd.DataFrame(d)

You can simply parse each string, which has the form of a Python literal (a list of dictionaries), with ast.literal_eval:

import ast
df['home1'] = df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[0]['home'], axis=1)
df['away1'] = df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[0]['away'], axis=1)
df['home2'] = df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[1]['home'], axis=1)
df['away2'] = df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[1]['away'], axis=1)
df = df.drop(columns='accumulatedscore')

Your df would then look like:

   matchNum home1 away1 home2 away2
0        78     0     0     1     0
1        56     2     1     4     3
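
The snippet above calls ast.literal_eval four times per row. As a sketch of an equivalent variant that parses each cell only once (parsed is a name introduced for illustration, and the loop assumes every row has at least two periods):

```python
import ast
import pandas as pd

df = pd.DataFrame({
    "matchNum": [78, 56],
    "accumulatedscore": [
        "[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '0', 'away': '0'}, "
        "{'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}]",
        "[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, "
        "{'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '4', 'away': '3'}]",
    ],
})

# Parse every cell once into a list of dicts.
parsed = df["accumulatedscore"].apply(ast.literal_eval)

# Pull the home/away score of each period into its own column.
for i in range(2):
    df[f"home{i + 1}"] = [periods[i]["home"] for periods in parsed]
    df[f"away{i + 1}"] = [periods[i]["away"] for periods in parsed]

df = df.drop(columns="accumulatedscore")
print(df)
```

The values are left as strings, matching the output shown above.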

You can extract only the relevant key-value pairs from each dictionary in df['accumulatedscore'], explode the Series, convert it to a DataFrame, and combine the duplicate indices. Note that this assumes each cell already holds a list of dicts rather than a string:

df1 = (df.merge(df['accumulatedscore']
                .apply(lambda lst:tuple({'home'+str(i): d['home'], 'away'+str(i): d['away']} 
                                        for i, d in enumerate(lst, 1)))
                .explode()
                .apply(pd.Series)
                .groupby(level=0).first(), 
                left_index=True, right_index=True)
       .drop('accumulatedscore', axis=1))

Output:

   matchNum home1 away1 home2 away2
0        78     0     0     1     0
1        56     2     1     4     3
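
The explode-based approach iterates over each cell as a list, so string cells need to be parsed first. A sketch of normalizing the column beforehand, assuming string cells are valid Python literals; to_list is a hypothetical helper introduced here, not part of pandas:

```python
import ast
import pandas as pd

def to_list(cell):
    """Return the cell as a list, parsing it first if it is a string."""
    return ast.literal_eval(cell) if isinstance(cell, str) else cell

# Illustrative frame mixing a string cell and a cell that is already a list.
df = pd.DataFrame({
    "matchNum": [78, 56],
    "accumulatedscore": [
        "[{'home': '0', 'away': '0'}, {'home': '1', 'away': '0'}]",
        [{"home": "2", "away": "1"}, {"home": "4", "away": "3"}],
    ],
})

# After this step every cell is a real list of dicts.
df["accumulatedscore"] = df["accumulatedscore"].apply(to_list)
print(df["accumulatedscore"].map(type).tolist())
```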
