How to vectorize a loop that accesses each row of a column in Python

Question

I have a large data set with over 350,000 rows. One of the columns holds a single-entry dictionary in each row. I want to turn each unique key into a new column in the data frame and put the corresponding value in the matching [row, column] position.

Here's the code I was hoping to use, but with this many rows it takes too long.

idx = 0
for row in df['value']:
    for key in row:
        # Create the column the first time a key is seen, defaulting to 0
        if key not in df.columns:
            df[key] = 0
        # Write this row's value into the matching column
        df.loc[idx, key] = row[key]
    idx += 1

Here's the sample data:

import pandas as pd

df = pd.DataFrame({'time': [12,342,786],
                   'event': ['offer received', 'transaction', 'offer viewed'],
                   'value': [{'offer id': '2906b810c7d4411798c6938adc9daaa5'}, {'amount': 0.35000000000000003},
                            {'offer id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}]
                   })

Here is the expected output:

df2 = pd.DataFrame({'time': [12,342,786],
               'event': ['offer received', 'transaction', 'offer viewed'],
               'value': [{'offer id': '2906b810c7d4411798c6938adc9daaa5'}, {'amount': 0.35000000000000003},
                        {'offer id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}],
               'offer id': ['2906b810c7d4411798c6938adc9daaa5', 0, '0b1e1539f2cc45b7b9fa7c272da2e1d7' ],
               'amount': [0, 0.35000000000000003, 0]
               })

Answer 1

df["offer id"] = df["value"].apply(lambda d: d["offer id"] if "offer id" in d else 0)
df["amount"] = df["value"].apply(lambda d: d["amount"] if "amount" in d else 0)

Output:

   time           event                                             value  \
0    12  offer received  {'offer id': '2906b810c7d4411798c6938adc9daaa5'}   
1   342     transaction                   {'amount': 0.35000000000000003}   
2   786    offer viewed  {'offer id': '0b1e1539f2cc45b7b9fa7c272da2e1d7'}   

                           offer id  amount  
0  2906b810c7d4411798c6938adc9daaa5    0.00  
1                                 0    0.35  
2  0b1e1539f2cc45b7b9fa7c272da2e1d7    0.00
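
The two apply calls above hard-code the key names. If the keys are not known in advance, one possible generalization is to let pandas expand the dictionaries itself. The following is only a sketch built on the sample data from the question; the names expanded and result are made up for illustration, and it assumes that no dictionary key collides with an existing column name (df.join would raise an error in that case).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": [12, 342, 786],
    "event": ["offer received", "transaction", "offer viewed"],
    "value": [
        {"offer id": "2906b810c7d4411798c6938adc9daaa5"},
        {"amount": 0.35000000000000003},
        {"offer id": "0b1e1539f2cc45b7b9fa7c272da2e1d7"},
    ],
})

# Build one column per distinct key in a single pass.
# Rows that lack a given key get NaN in that column.
expanded = pd.DataFrame(df["value"].tolist(), index=df.index)

# The expected output in the question uses 0 for missing entries.
expanded = expanded.fillna(0)

# Attach the new columns to the original frame
# (assumes no key has the same name as an existing column).
result = df.join(expanded)
```

This avoids both the Python-level loop and the per-cell df.loc writes, which are likely what makes the original code slow on 350,000 rows. Leaving the missing entries as NaN instead of calling fillna(0) may be preferable if 0 could be confused with a real amount.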

Solution 1
1 · Accepted · 2020-04-29 01:51:32