pandas dataframe conditional population of a new column

I am working on manipulating a column (Trend) in a pandas DataFrame. Below is my source DataFrame. Currently the Trend column is set to 0.

[image: source DataFrame]

The logic I want to use to populate the Trend column is below:

  1. if df['Close'] > df.shift(1)['Down'] then 1

  2. if df['Close'] < df.shift(1)['Up'] then -1

  3. if neither of the above conditions is met, then df.shift(1)['Trend']. If this value is NaN, then set it to 1.

The same logic in plain text:

  1. if the current close is greater than the previous row's value in the Down column, then 1
  2. if the current close is less than the previous row's value in the Up column, then -1
  3. if neither of those conditions is met, then use the previous row's value of the Trend column as long as it is not NaN; if it is NaN, then set it to 1
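The three rules can also be read as a plain row-by-row loop. The sketch below is only a reference reading of them, not code from the original post: the helper name `trend_by_rules` is hypothetical, and it assumes that "previous row value of Trend" means the value just computed for the previous row rather than the placeholder 0.

```python
import numpy as np
import pandas as pd


def trend_by_rules(df: pd.DataFrame) -> pd.Series:
    """Row-by-row reading of the three rules (hypothetical helper)."""
    close = df["Close"].to_numpy(dtype=float)
    prev_up = df["Up"].shift(1).to_numpy(dtype=float)
    prev_down = df["Down"].shift(1).to_numpy(dtype=float)

    trend = np.empty(len(df))
    prev_trend = np.nan  # the first row has no previous Trend
    for i in range(len(df)):
        if close[i] > prev_down[i]:    # rule 1 (False when prev_down is NaN)
            trend[i] = 1
        elif close[i] < prev_up[i]:    # rule 2 (False when prev_up is NaN)
            trend[i] = -1
        elif np.isnan(prev_trend):     # rule 3, no previous Trend: default to 1
            trend[i] = 1
        else:                          # rule 3, carry the previous Trend forward
            trend[i] = prev_trend
        prev_trend = trend[i]
    return pd.Series(trend, index=df.index, name="Trend")
```

A loop like this is slow on large frames, but it is handy for checking a vectorized version against the stated rules.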

UPDATE

Data as text

   Close        Up      Down  Trend
   3.138       NaN       NaN      0
   3.141       NaN       NaN      0
   3.141       NaN       NaN      0
   3.130       NaN       NaN      0
   3.110       NaN       NaN      0
   3.130  3.026432  3.214568      0
   3.142  3.044721  3.214568      0
   3.140  3.047010  3.214568      0
   3.146  3.059807  3.214568      0
   3.153  3.064479  3.214568      0
   3.173  3.080040  3.214568      0
   3.145  3.080040  3.214568      0
   3.132  3.080040  3.214568      0
   3.131  3.080040  3.209850      0
   3.141  3.080040  3.209850      0
   3.098  3.080040  3.205953      0
   3.070  3.080040  3.195226      0
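To experiment with this data, one option is to parse the text directly. This is a minimal sketch that assumes the table is whitespace-separated exactly as shown; the variable names `raw` and `df` are illustrative.

```python
import io

import pandas as pd

# The table from the question, copied as whitespace-separated text
raw = """\
Close        Up      Down  Trend
3.138       NaN       NaN      0
3.141       NaN       NaN      0
3.141       NaN       NaN      0
3.130       NaN       NaN      0
3.110       NaN       NaN      0
3.130  3.026432  3.214568      0
3.142  3.044721  3.214568      0
3.140  3.047010  3.214568      0
3.146  3.059807  3.214568      0
3.153  3.064479  3.214568      0
3.173  3.080040  3.214568      0
3.145  3.080040  3.214568      0
3.132  3.080040  3.214568      0
3.131  3.080040  3.209850      0
3.141  3.080040  3.209850      0
3.098  3.080040  3.205953      0
3.070  3.080040  3.195226      0
"""

# Split on runs of whitespace; the literal "NaN" is parsed as a missing value
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
```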

Expected output

[image: expected output]

We can use numpy.select to choose a value depending on which condition is satisfied. The placeholder 0s in "Trend" are first replaced with NaN, and the outcome of numpy.select is passed to fillna to fill in those missing "Trend" values (this way any existing non-zero "Trend" values are not lost). Then, since rows where neither condition holds must take the previous "Trend" value, we use ffill and fill the remaining NaN values with 1.

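import numpy as np
import pandas as pd

# Treat the placeholder 0s as missing so they can be filled in
df['Trend'] = (df['Trend'].replace(0, np.nan)
               # 1 if Close > previous Down, -1 if Close < previous Up, else NaN
               .fillna(pd.Series(np.select([df['Close'] > df['Down'].shift(),
                                            df['Close'] < df['Up'].shift()],
                                           [1, -1], np.nan), index=df.index))
               # carry the previous Trend forward; default to 1 where there is none
               .ffill().fillna(1))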
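To see what each link in that chain contributes, the same steps can be run one at a time. The sketch below uses only the last four rows of the question's data; the intermediate names (`signal`, `trend`) are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Last four rows of the question's data, with Trend still at its placeholder 0
df = pd.DataFrame({
    "Close": [3.131, 3.141, 3.098, 3.070],
    "Up": [3.080040, 3.080040, 3.080040, 3.080040],
    "Down": [3.209850, 3.209850, 3.205953, 3.195226],
    "Trend": [0, 0, 0, 0],
})

# Step 1: 1 / -1 where a rule fires, NaN elsewhere.
# Comparisons against the NaN produced by shift() are False, so row 0 stays NaN.
signal = pd.Series(
    np.select(
        [df["Close"] > df["Down"].shift(), df["Close"] < df["Up"].shift()],
        [1, -1],
        default=np.nan,
    ),
    index=df.index,
)

# Step 2: treat placeholder 0s as missing and fill them from the signal
trend = df["Trend"].replace(0, np.nan).fillna(signal)

# Step 3: carry the last known Trend forward, then default leading gaps to 1
trend = trend.ffill().fillna(1)
print(trend.tolist())
```

Only the last row fires a rule here (3.070 is below the previous Up of 3.080040), so the first three rows fall through to the default of 1.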

Output:

    Close        Up      Down  Trend
0   3.138       NaN       NaN    1.0
1   3.141       NaN       NaN    1.0
2   3.141       NaN       NaN    1.0
3   3.130       NaN       NaN    1.0
4   3.110       NaN       NaN    1.0
5   3.130  3.026432  3.214568    1.0
6   3.142  3.044721  3.214568    1.0
7   3.140  3.047010  3.214568    1.0
8   3.146  3.059807  3.214568    1.0
9   3.153  3.064479  3.214568    1.0
10  3.173  3.080040  3.214568    1.0
11  3.145  3.080040  3.214568    1.0
12  3.132  3.080040  3.214568    1.0
13  3.131  3.080040  3.209850    1.0
14  3.141  3.080040  3.209850    1.0
15  3.098  3.080040  3.205953    1.0
16  3.070  3.080040  3.195226   -1.0
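The Trend values print as 1.0 and -1.0 because NaN was used as an intermediate marker, which makes the column float. If integer labels are preferred, a cast after all NaN values have been filled is one option. The short series below is a hypothetical stand-in for the computed column.

```python
import pandas as pd

# Hypothetical stand-in for the computed column: floats with no NaN left
trend = pd.Series([1.0, 1.0, -1.0], name="Trend")

# Safe only once every NaN has been filled; astype raises on NaN otherwise
trend_int = trend.astype("int64")
print(trend_int.tolist())
```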
