
Create a new column of previous N rows as an array

I have a dataframe df that looks like this:

   a       b 
0  30.05  29.55
1  30.20  26.05
2  30.81  25.65
3  31.12  26.44
.. ...    ...
85 30.84  25.65
86 31.12  26.44
87 29.55  25.57
88 32.41  25.45
89 21.55  29.57
90 32.91  26.41
91 34.12  25.69

I need to create a new column 'c' that holds an array of the column 'b' value plus the column 'b' values from the previous 4 rows. So the resulting df would look like:

     a      b     c
0  30.05  29.55 [29.55,0,0,0,0]
1  30.20  26.05 [26.05,29.55,0,0,0]
2  30.81  25.65 [25.65,26.05,29.55,0,0]
3  31.12  26.44 [26.44,25.65,26.05,29.55,0]
.. ...    ...
85 30.84  25.65 [25.65, 44.60, 30.15, 29.55, 24.66 ]
86 31.12  26.44 [26.44, 25.65, 25.65, 25.65, 25.65 ]
87 29.55  25.57 [25.57, 26.44, 25.65, 25.65, 25.65 ]
88 32.41  25.45 [25.45, 25.57, 26.44, 25.65, 25.65 ]
89 21.55  29.57 [29.57, 25.45, 25.57, 26.44, 25.65 ]
90 32.91  26.41 [26.41, 29.57, 25.45, 25.57, 26.44 ]
91 34.12  25.69 [25.69, 26.41, 29.57, 25.45, 25.57 ]

I know I can access previous rows with df.b.shift(1), df.b.shift(2), etc., but I want to control how many rows I look back with a variable rather than typing out many shift(n) calls.

After looking all day I'm stuck. (Python 3.6)

You could use pd.concat with range(N), where N is the number of values in each array (N = 4 below; use 5 for the current value plus the previous 4):

In [60]: df['c'] = pd.concat([df.b.shift(i) for i in range(4)], axis=1).fillna(0).values.tolist()

In [61]: df
Out[61]:
        a      b                             c
0   30.05  29.55        [29.55, 0.0, 0.0, 0.0]
1   30.20  26.05      [26.05, 29.55, 0.0, 0.0]
2   30.81  25.65    [25.65, 26.05, 29.55, 0.0]
3   31.12  26.44  [26.44, 25.65, 26.05, 29.55]
85  30.84  25.65  [25.65, 26.44, 25.65, 26.05]
86  31.12  26.44  [26.44, 25.65, 26.44, 25.65]
87  29.55  25.57  [25.57, 26.44, 25.65, 26.44]
88  32.41  25.45  [25.45, 25.57, 26.44, 25.65]
89  21.55  29.57  [29.57, 25.45, 25.57, 26.44]
90  32.91  26.41  [26.41, 29.57, 25.45, 25.57]
91  34.12  25.69  [25.69, 26.41, 29.57, 25.45]
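
To make the lookback length a variable, as the question asks, the same pd.concat idea can be wrapped in a small helper. This is only a sketch: the name lookback_lists and its parameters n_values and fill_value are hypothetical, not part of pandas, and the four-row frame is just the head of the sample data.

```python
import pandas as pd


def lookback_lists(series, n_values, fill_value=0.0):
    """Hypothetical helper: for each row, list the current value followed by
    the n_values - 1 previous values, padding with fill_value at the start."""
    # Column i of the concatenated frame holds the series shifted down by i rows.
    shifted = pd.concat([series.shift(i) for i in range(n_values)], axis=1)
    rows = shifted.fillna(fill_value).values.tolist()
    # Build an object Series so each cell holds one list, aligned on the original index.
    return pd.Series(rows, index=series.index)


df = pd.DataFrame({"a": [30.05, 30.20, 30.81, 31.12],
                   "b": [29.55, 26.05, 25.65, 26.44]})
df["c"] = lookback_lists(df["b"], n_values=5)  # current value plus previous 4
print(df)
```

Note that fillna also replaces any NaN already present in column 'b', not only the gaps introduced by shift.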

Or use np.column_stack on shift(n):

In [70]: np.column_stack([df.b.shift(i).fillna(0) for i in range(4)]).tolist()
Out[70]:
[[29.55, 0.0, 0.0, 0.0],
 [26.05, 29.55, 0.0, 0.0],
 [25.65, 26.05, 29.55, 0.0],
 [26.44, 25.65, 26.05, 29.55],
 [25.65, 26.44, 25.65, 26.05],
 [26.44, 25.65, 26.44, 25.65],
 [25.57, 26.44, 25.65, 26.44],
 [25.45, 25.57, 26.44, 25.65],
 [29.57, 25.45, 25.57, 26.44],
 [26.41, 29.57, 25.45, 25.57],
 [25.69, 26.41, 29.57, 25.45]]
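
A different NumPy route, not used in the answer above, is to pad the front of the values and take sliding windows. The sketch below assumes NumPy 1.20 or later (for sliding_window_view); lookback_array, n_values and fill_value are hypothetical names chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lookback_array(values, n_values, fill_value=0.0):
    """Hypothetical helper: 2-D array whose row k is
    [values[k], values[k-1], ..., values[k-n_values+1]], padded with fill_value."""
    values = np.asarray(values, dtype=float)
    # Pad the front so the first rows have something to look back to.
    padded = np.concatenate([np.full(n_values - 1, fill_value), values])
    # Each window runs oldest to newest; one window per original value.
    windows = sliding_window_view(padded, n_values)
    # Reverse each window so the current value comes first.
    return windows[:, ::-1]


result = lookback_array([29.55, 26.05, 25.65], n_values=3)
print(result.tolist())
```

The result is a read-only view of the padded array; calling .tolist() on it gives one list per row, which can then be assigned to a dataframe column.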

You can use a conditional list comprehension (to check whether the lookback reaches before the first value in the index).

rows_lookback = 5

df = df.assign(c=[[df['b'].iat[n - i] if n - i >= 0 else 0 
                   for i in range(rows_lookback)] 
                  for n in range(len(df['b']))])
>>> df
        a      b                                    c
0   30.05  29.55                  [29.55, 0, 0, 0, 0]
1   30.20  26.05              [26.05, 29.55, 0, 0, 0]
2   30.81  25.65          [25.65, 26.05, 29.55, 0, 0]
3   31.12  26.44      [26.44, 25.65, 26.05, 29.55, 0]
85  30.84  25.65  [25.65, 26.44, 25.65, 26.05, 29.55]
86  31.12  26.44  [26.44, 25.65, 26.44, 25.65, 26.05]
87  29.55  25.57  [25.57, 26.44, 25.65, 26.44, 25.65]
88  32.41  25.45  [25.45, 25.57, 26.44, 25.65, 26.44]
89  21.55  29.57  [29.57, 25.45, 25.57, 26.44, 25.65]
90  32.91  26.41  [26.41, 29.57, 25.45, 25.57, 26.44]
91  34.12  25.69  [25.69, 26.41, 29.57, 25.45, 25.57]


