Trying to create a new dataframe column in pandas based on a dataframe related if statement
I'm learning Python & pandas and practicing with different stock calculations. I've searched for help with this but haven't found an answer similar enough, or didn't understand how to deduce the correct approach from the previous answers.
I have read stock data for a given time frame with DataReader into a dataframe df. In df I have Date, Volume and Adj Close columns, which I want to use to create a new column "OBV" based on given criteria. OBV is a cumulative value that adds today's volume to, or subtracts it from, the previous day's OBV, depending on the adjusted close price.
The calculation of OBV is simple:
If Adj Close is higher today than Adj Close of yesterday, then add the Volume of today to the (cumulative) volume of yesterday.
If Adj Close is lower today than Adj Close of yesterday, then subtract the Volume of today from the (cumulative) volume of yesterday.
On day 1, OBV = 0.
This is then repeated along the time frame, and OBV accumulates.
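The rule above can be sketched as a plain loop before bringing pandas into it. This is only an illustration: the function name `obv_loop` and the sample numbers are made up, and leaving OBV unchanged when the close is unchanged is an assumption, since the rule above does not cover that case.

```python
def obv_loop(closes, volumes):
    """Accumulate OBV day by day from parallel lists of closes and volumes."""
    if not closes:
        return []
    obv = [0]  # day 1 starts at 0, as stated in the rule
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])   # close went up: add today's volume
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])   # close went down: subtract it
        else:
            obv.append(obv[-1])                # assumption: flat close, OBV unchanged
    return obv


# Hypothetical sample data, not real prices
closes = [10.0, 11.0, 10.5, 10.5, 12.0]
volumes = [100, 200, 150, 300, 50]
print(obv_loop(closes, volumes))  # [0, 200, 50, 50, 100]
```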
Here are the basic imports and setup:
import numpy as np
import pandas as pd
import pandas_datareader
import datetime
from pandas_datareader import data, wb
start = datetime.date(2012, 4, 16)
end = datetime.date(2017, 4, 13)
# Reading in Yahoo Finance data with DataReader
df = data.DataReader('GOOG', 'yahoo', start, end)
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
#This is what I cannot get to work, and I've tried two different ways.
#ATTEMPT1
def obv1(column):
    if column["Adj Close"] > column["Adj close"].shift(-1):
        val = column["Volume"].shift(-1) + column["Volume"]
    else:
        val = column["Volume"].shift(-1) - column["Volume"]
    return val
df["OBV"] = df.apply(obv1, axis=1)
#ATTEMPT 2
def obv1(df):
    if df["Adj Close"] > df["Adj close"].shift(-1):
        val = df["Volume"].shift(-1) + df["Volume"]
    else:
        val = df["Volume"].shift(-1) - df["Volume"]
    return val
df["OBV"] = df.apply(obv1, axis=1)
Both give me an error.
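As a side note on why both attempts fail: with `axis=1`, `apply` hands the function one row at a time, so `row["Adj Close"]` is a single number rather than a column, and the label `"Adj close"` (lower-case c) does not exist at all. The sketch below reproduces both problems on a small made-up frame; the function names and the data are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded stock data
df = pd.DataFrame({"Adj Close": [10.0, 11.0, 10.5], "Volume": [100, 200, 150]})


def obv_typo(row):
    # "Adj close" is not a label in the row, so the lookup itself fails
    return row["Adj Close"] > row["Adj close"].shift(-1)


def obv_scalar(row):
    # With the label fixed, row["Adj Close"] is one number, which has no .shift
    return row["Adj Close"] > row["Adj Close"].shift(-1)


errors = []
for func in (obv_typo, obv_scalar):
    try:
        df.apply(func, axis=1)
    except (KeyError, AttributeError) as exc:
        errors.append(type(exc).__name__)

print(errors)
```

Comparing a row with the previous one therefore needs column-level operations such as `shift` or `diff` applied to the whole column, not to individual rows.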
Consider the dataframe df
np.random.seed([3,1415])
df = pd.DataFrame(dict(
    Volume=np.random.randint(100, 200, 10),
    AdjClose=np.random.rand(10)
))
print(df)
AdjClose Volume
0 0.951710 111
1 0.346711 198
2 0.289758 174
3 0.662151 190
4 0.171633 115
5 0.018571 155
6 0.182415 113
7 0.332961 111
8 0.150202 113
9 0.810506 126
Multiply the Volume by -1 when the change in AdjClose is negative. Then cumsum
(df.Volume * (~df.AdjClose.diff().le(0) * 2 - 1)).cumsum()
0 111
1 -87
2 -261
3 -71
4 -186
5 -341
6 -228
7 -117
8 -230
9 -104
dtype: int64
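To make the one-liner easier to follow, here is the same expression split into named steps on a small made-up frame (the variable names and the data are illustrative only). Two details become visible: the first row has no previous value, so its `diff` is NaN and it ends up with a sign of +1, and an unchanged close is treated like a drop because of `le(0)`.

```python
import pandas as pd

# Hypothetical toy data, not the random frame above
df = pd.DataFrame({"AdjClose": [1.0, 0.5, 0.8, 0.8], "Volume": [10, 20, 30, 40]})

change = df.AdjClose.diff()   # NaN for the first row, then day-over-day change
not_up = change.le(0)         # True where the close fell or stayed flat; NaN -> False
sign = (~not_up) * 2 - 1      # True -> 1, False -> -1
obv = (df.Volume * sign).cumsum()

print(sign.tolist())  # [1, -1, 1, -1]
print(obv.tolist())   # [10, -10, 20, -20]
```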
Include this alongside the rest of the df
df.assign(new=(df.Volume * (~df.AdjClose.diff().le(0) * 2 - 1)).cumsum())
AdjClose Volume new
0 0.951710 111 111
1 0.346711 198 -87
2 0.289758 174 -261
3 0.662151 190 -71
4 0.171633 115 -186
5 0.018571 155 -341
6 0.182415 113 -228
7 0.332961 111 -117
8 0.150202 113 -230
9 0.810506 126 -104
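Note that the result above starts at the first day's volume (111) rather than 0. If OBV should be 0 on day 1, as the question states, one possible variant uses `np.sign` on the price change and fills the first row's NaN with 0. This is a sketch on made-up data with the question's column names; treating an unchanged close as "no change to OBV" is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample using the question's column names
df = pd.DataFrame({
    "Adj Close": [10.0, 11.0, 10.5, 10.5, 12.0],
    "Volume": [100, 200, 150, 300, 50],
})

# +1 when the close rose, -1 when it fell, 0 when flat or on the first day
direction = np.sign(df["Adj Close"].diff()).fillna(0)
df["OBV"] = (direction * df["Volume"]).cumsum().astype("int64")

print(df["OBV"].tolist())  # [0, 200, 50, 50, 100]
```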