
Pandas iteratively append row values from multiple DataFrame columns

I want to iteratively append row values from multiple columns to a new column in a new DataFrame, based on a group.

My goal is to have 1 row for each customer, with 1 column for the customer's ID and 1 column for their timeline. The timeline lists the date of each event followed by the event description, for all dates and events, in chronological order.

I have solved this with a series of dictionaries. I am searching for a clean, elegant, pandas-style way to accomplish this, as this code will be run frequently with small changes to customers, events, etc.

Example:

import pandas as pd

df_have = pd.DataFrame({'Customer_ID': ['customer_1', 'customer_1', 'customer_1', 'customer_2', 'customer_2'],
                        'Event': ['purchased cornflakes', 'purchased eggs', 'purchased waffles', 'sold eggs', 'purchased cows'],
                        'Date': ['2011-06-16', '2011-06-13', '2011-06-09', '2011-06-13', '2011-06-18']})

df_have['Date'] = pd.to_datetime(df_have['Date'])

df_have.sort_values(['Customer_ID', 'Date'], inplace=True)
df_have

This is the DataFrame I currently have (df_have).

df_want = pd.DataFrame({'Customer_ID':['customer_1','customer_2'],
                       'Time_Line':[['2011-06-09,purchased waffles,2011-06-13,purchased eggs,2011-06-16,purchased cornflakes'],
                                   ['2011-06-13,sold eggs,2011-06-18,purchased cows']]})
df_want

This is the DataFrame I want (df_want).

Steps:

1) Set Customer_ID as the index, since it stays fixed throughout the operation.

2) stack the remaining columns so that the Date and Event values of each row fall one below the other.

3) Perform a groupby on the index (level=0) and convert each group's values into a list. Because the columns were stacked in this order, dates and events alternate within each list.


# set the maximum column width used when displaying a DataFrame
pd.set_option('display.max_colwidth', 100)

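# select the columns explicitly so that each date precedes its event,
# whatever the column order of df_have happens to be
df_have.set_index('Customer_ID')[['Date', 'Event']].stack(
    ).groupby(level=0).apply(list).reset_index(name="Time_Line")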
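
To see what each of the three steps contributes, the chain can be split into named intermediates. This is only a sketch on the question's sample data: the names `indexed`, `stacked` and `timeline` are illustrative, and the explicit `[['Date', 'Event']]` selection is an assumption added here, because recent pandas versions keep the dictionary's insertion order and would otherwise stack Event before Date.

```python
import pandas as pd

df_have = pd.DataFrame({
    'Customer_ID': ['customer_1', 'customer_1', 'customer_1', 'customer_2', 'customer_2'],
    'Event': ['purchased cornflakes', 'purchased eggs', 'purchased waffles',
              'sold eggs', 'purchased cows'],
    'Date': ['2011-06-16', '2011-06-13', '2011-06-09', '2011-06-13', '2011-06-18'],
})
df_have['Date'] = pd.to_datetime(df_have['Date'])
df_have = df_have.sort_values(['Customer_ID', 'Date'])

# Step 1: Customer_ID becomes the index; Date and Event stay as columns,
# selected explicitly so that Date comes first.
indexed = df_have.set_index('Customer_ID')[['Date', 'Event']]

# Step 2: stack() lays out each row's Date and Event as consecutive
# entries of a single Series indexed by (Customer_ID, column name).
stacked = indexed.stack()

# Step 3: group on the first index level and collect each customer's
# entries into one list.
timeline = stacked.groupby(level=0).apply(list).reset_index(name='Time_Line')
print(timeline)
```

Note that the dates in each list are `Timestamp` objects rather than strings, since Date was converted with `pd.to_datetime`.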



To change the order in which the values appear inside the list:

# reindex_axis was removed in pandas 1.0; reindex(columns=...) is the replacement
df_have.set_index('Customer_ID').reindex(columns=['Event', 'Date']).stack(
    ).groupby(level=0).apply(list).reset_index(name="Time_Line")
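
The lists produced by the stack approach hold `Timestamp` objects and strings as separate elements, whereas df_want in the question shows a single comma-separated string per customer. If that string form is what is needed, one possible variation (not part of the original answer) is to build a "date,event" fragment per row and join the fragments per customer. The names `entries` and `df_timeline` are illustrative, and the sketch assumes dates should be rendered as YYYY-MM-DD.

```python
import pandas as pd

df_have = pd.DataFrame({
    'Customer_ID': ['customer_1', 'customer_1', 'customer_1', 'customer_2', 'customer_2'],
    'Event': ['purchased cornflakes', 'purchased eggs', 'purchased waffles',
              'sold eggs', 'purchased cows'],
    'Date': ['2011-06-16', '2011-06-13', '2011-06-09', '2011-06-13', '2011-06-18'],
})
df_have['Date'] = pd.to_datetime(df_have['Date'])
df_have = df_have.sort_values(['Customer_ID', 'Date'])

# One "date,event" fragment per row, with the date formatted as YYYY-MM-DD.
entries = df_have['Date'].dt.strftime('%Y-%m-%d') + ',' + df_have['Event']

# Join each customer's fragments, in the already sorted order, into one string.
df_timeline = (
    entries.groupby(df_have['Customer_ID'])
    .agg(','.join)
    .reset_index(name='Time_Line')
)
print(df_timeline)
```

This yields a bare string in each Time_Line cell; df_want wraps that string in a one-element list, which would need an extra step such as mapping each string to `[s]`.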

