Rebalancing a portfolio with the bandwidth method in Python

Question

We need to calculate a continuously rebalanced portfolio of two stocks. Let's call them A and B. Each should make up an equal part of the portfolio, so if I have $100 in my portfolio, $50 is invested in A and $50 in B. As the two stocks perform very differently, they will not keep their equal weights (after three months A may already be worth $70 while B has dropped to $45). The problem is that each stock has to keep its share of the portfolio within a certain bandwidth of tolerance. This bandwidth is 5%. So I need a function that does: if A > B*1.05 or A*1.05 < B, then rebalance.
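The rule stated above can be written down as a small predicate. This is only a sketch of the criterion as worded in the question; the function name `needs_rebalance` and its arguments are made up for illustration.

```python
def needs_rebalance(value_a, value_b, tol=0.05):
    """Return True if one position exceeds the other by more than tol.

    value_a and value_b are the current dollar values of the two positions.
    Hypothetical helper, not part of the original code.
    """
    return value_a > value_b * (1 + tol) or value_a * (1 + tol) < value_b


# The example from the question: A is worth $70, B has dropped to $45.
print(needs_rebalance(70, 45))   # the band is clearly broken
print(needs_rebalance(52, 50))   # 52 is still below 50 * 1.05 = 52.5
```

Note that this compares the two dollar values directly, which is the formulation used in the question; a band on portfolio weights (45% to 55%) would be a slightly different test.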

This first part only serves to fetch some data as quickly as possible, so that we have a common basis for discussion and comparable results. You can just copy and paste this whole code and it should work for you.

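# pandas.io.data (used below) was removed from later pandas releases; its
# successor is the separate pandas-datareader package.
import pandas as pd
from datetime import datetime
import numpy as np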


df1 = pd.io.data.get_data_yahoo("IBM", 
                                start=datetime(1970, 1, 1), 
                                end=datetime.today())
df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)

df2 = pd.io.data.get_data_yahoo("F", 
                                start=datetime(1970, 1, 1), 
                                end=datetime.today())
df2.rename(columns={'Adj Close': 'ford'}, inplace=True)

df = df1.join(df2.ford, how='inner')
del df["Open"]
del df["High"]
del df["Low"]
del df["Close"]
del df["Volume"]

Now we start to calculate the relative performance of each stock with the formula df.ibm/df.ibm[0]. The problem is that as soon as we break the first bandwidth, we need to reset the 0 in that formula, since we rebalance and need to start calculating from that point on. So we use df.d as a placeholder for this base index and set it equal to df.t as soon as a bandwidth gets broken. df.t simply counts the rows of the dataframe and can therefore always tell us “where we are”. So here the actual calculation starts:
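Because the base index depends on what happened in earlier rows, the reset logic is easiest to see as an explicit loop. The following is a minimal sketch on plain arrays, assuming the rule "if a band is broken at row t, rows after t are measured relative to row t"; the function name `flex_relative_loop` and the toy prices are invented for illustration and it is not meant to reproduce the exact numbers of the real IBM/Ford data.

```python
import numpy as np


def flex_relative_loop(price_a, price_b, tol):
    """Relative performance of two price series with a resettable base row.

    Returns (d, perf_a, perf_b), where d[t] is the base row used at row t.
    """
    price_a = np.asarray(price_a, dtype=float)
    price_b = np.asarray(price_b, dtype=float)
    n = len(price_a)
    d = np.zeros(n, dtype=int)
    perf_a = np.empty(n)
    perf_b = np.empty(n)
    base = 0
    for t in range(n):
        d[t] = base
        perf_a[t] = price_a[t] / price_a[base]
        perf_b[t] = price_b[t] / price_b[base]
        broken = (perf_a[t] > perf_b[t] * (1 + tol)
                  or perf_a[t] * (1 + tol) < perf_b[t])
        if broken:
            # The break is detected at row t; from the next row on,
            # performance is measured relative to row t.
            base = t
    return d, perf_a, perf_b


d, perf_a, perf_b = flex_relative_loop([10, 10, 12, 12, 12],
                                       [10, 10, 10, 10, 10], tol=0.05)
print(d)
print(perf_a)
```

In this toy series the band is broken at row 2, so rows 3 and 4 use row 2 as their base, which mirrors how df.d is described as switching to df.t on the day after detection.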

tol = 0.05  # setting the bandwidth tolerance
df["d"] = 0  # row number of the last rebalancing (0 = start)
df["t"]= np.arange(len(df))
tol = 0.3

def flex_relative(x):
    if df.ibm/df.ibm.iloc[df.d].values < df.ford/df.ford.iloc[df.d].values * (1+tol):
        return  df.iloc[df.index.get_loc(x.name) - 1]['d'] == df.t
    elif df.ibm/df.ibm.iloc[df.d].values > df.ford/df.ford.iloc[df.d].values * (1+tol):
        return df.iloc[df.index.get_loc(x.name) - 1]['d'] == df.t
    else:
        return df.ibm/df.ibm.iloc[df.d].values, df.ford/df.ford.iloc[df.d].values



df["ibm_performance"], df["ford_performance"], = df.apply(flex_relative, axis =1)

The problem is that I am getting the following error from the last line of code, where I try to apply the function with df.apply(flex_relative, axis=1):

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 1972-06-01 00:00:00')

None of the options given in the error message solves my problem, so I really don't know what to do...
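The error itself can be reproduced on a tiny frame. The sketch below uses made-up prices and shows that comparing whole columns yields a boolean Series, which `if` cannot reduce to a single True or False, whereas comparing the scalar fields of the row passed to `apply` works. It only illustrates the source of the exception, not a full rebalancing solution.

```python
import pandas as pd

df = pd.DataFrame({"ibm": [10.0, 11.0, 12.0], "ford": [10.0, 10.0, 10.0]})
tol = 0.05

# Comparing whole columns gives one boolean per row, not a single bool.
comparison = df.ibm / df.ibm.iloc[0] > df.ford / df.ford.iloc[0] * (1 + tol)

message = ""
try:
    if comparison:  # this is what happens inside flex_relative
        pass
except ValueError as err:
    message = str(err)
print(message)

# Inside apply(..., axis=1) the argument is a single row, so its fields
# are scalars and can be compared in an ordinary expression.
row_flags = df.apply(
    lambda row: row["ibm"] / df.ibm.iloc[0]
    > row["ford"] / df.ford.iloc[0] * (1 + tol),
    axis=1,
)
print(row_flags.tolist())
```

Even with scalar comparisons, a row-wise `apply` cannot carry the "last rebalancing" state from one row to the next, which is why the answers use explicit loops.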

The only thing I have found so far is the link below, but calling an R function won't work for me, because I need to apply this to quite big datasets and I may also implement an optimization in this function, so it definitely needs to be built in Python. Here is the link anyway: Finance Lib with portfolio optimization method in python

Manually (which is not a good way to handle big data), I calculated that the first date for a rebalancing would be: 03.11.1972 00:00:00

The output of the dataframe at the first rebalancing should look like this:

                     ibm        ford        d   t   ibm_performance ford_performance
1972-11-01 00:00:00 6,505655    0,387415    0   107 1,021009107 0,959552418
1972-11-02 00:00:00 6,530709    0,398136    0   108 1,017092172 0,933713605
1972-11-03 00:00:00 6,478513    0,411718    0   109 1,025286667 0,902911702 # this is the day the rebalancing was detected
1972-11-06 00:00:00 6,363683    0,416007    109 110 1,043787536 0,893602752 # this is the day the rebalancing is implemented, therefore df.d gets set = df.t = 109
1972-11-08 00:00:00 6,310883    0,413861    109 111 1,052520384 0,898236364
1972-11-09 00:00:00 6,227073    0,422439    109 112 1,066686226 0,879996875

Thanks a lot for your support!

@Alexander: Yes, the rebalancing will take place the following day.

@maxymoo: If you run this code after yours, you get the portfolio weights of each stock, and they don't stay between 45% and 55%. They rather range between 25% and 75%:

df["ford_weight"] = df.ford_prop*df.ford/(df.ford_prop*df.ford+df.ibm_prop*df.ibm) #calculating the actual portfolio weights
df["ibm_weight"] = df.ibm_prop*df.ibm/(df.ford_prop*df.ford+df.ibm_prop*df.ibm)

print(df)
print(df.ibm_weight.min())
print(df.ibm_weight.max())
print(df.ford_weight.min())
print(df.ford_weight.max())

I have now tried for an hour or so to fix it, but could not find the problem.

Can I do anything to make this question clearer?

Answer 1

The main idea here is to work in terms of dollars instead of ratios. If you keep track of the number of shares and the relative dollar values of the ibm and ford shares, then you can express the criterion for rebalancing as

mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)

where the ratio equals

    df['ratio'] = df['ibm value'] / df['ford value']

and df['ibm value'] and df['ford value'] represent actual dollar values.
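As a quick illustration of the dollar-based criterion, here is a minimal sketch on three rows of made-up prices. It starts with $50 in each stock and uses the same column names as the answer; the share counts `sh_ibm` and `sh_ford` are local variables introduced only for this example.

```python
import pandas as pd

tol = 0.05
df = pd.DataFrame({"ibm": [10.0, 10.2, 11.0], "ford": [5.0, 5.0, 5.0]})

# Buy $50 of each stock at the first row's prices.
sh_ibm = 50.0 / df["ibm"].iloc[0]
sh_ford = 50.0 / df["ford"].iloc[0]

# Dollar value of each position on every row, and their ratio.
df["ibm value"] = sh_ibm * df["ibm"]
df["ford value"] = sh_ford * df["ford"]
df["ratio"] = df["ibm value"] / df["ford value"]

# Rows where the ratio has left the band [1 - tol, 1 + tol].
mask = (df["ratio"] >= 1 + tol) | (df["ratio"] <= 1 - tol)
print(df)
print(mask.tolist())
```

Only the last row, where the ibm position has grown to $55 against $50 in ford, falls outside the 5% band.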

import datetime as DT
import numpy as np
import pandas as pd
import pandas.io.data as PID

def setup_df():
    df1 = PID.get_data_yahoo("IBM", 
                             start=DT.datetime(1970, 1, 1), 
                             end=DT.datetime.today())
    df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)

    df2 = PID.get_data_yahoo("F", 
                             start=DT.datetime(1970, 1, 1), 
                             end=DT.datetime.today())
    df2.rename(columns={'Adj Close': 'ford'}, inplace=True)

    df = df1.join(df2.ford, how='inner')
    df = df[['ibm', 'ford']]
    # Use float columns, since they will hold fractional shares and dollar values.
    df['sh ibm'] = 0.0
    df['sh ford'] = 0.0
    df['ibm value'] = 0.0
    df['ford value'] = 0.0
    df['ratio'] = 0.0
    return df

def invest(df, i, amount):
    """
    Invest amount dollars evenly between ibm and ford
    starting at ordinal index i.
    This modifies df.
    """
    c = dict([(col, j) for j, col in enumerate(df.columns)])
    halfvalue = amount/2
    df.iloc[i:, c['sh ibm']] = halfvalue / df.iloc[i, c['ibm']]
    df.iloc[i:, c['sh ford']] = halfvalue / df.iloc[i, c['ford']]

    df.iloc[i:, c['ibm value']] = (
        df.iloc[i:, c['ibm']] * df.iloc[i:, c['sh ibm']])
    df.iloc[i:, c['ford value']] = (
        df.iloc[i:, c['ford']] * df.iloc[i:, c['sh ford']])
    df.iloc[i:, c['ratio']] = (
        df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']])

def rebalance(df, tol, i=0):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    c = dict([(col, j) for j, col in enumerate(df.columns)])
    while True:
        mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
        # ignore prior locations where the ratio falls outside tol range
        mask[:i] = False
        try:
            # Move i one index past the first index where mask is True.
            # Rebalancing happens at the new i, so the ratio at the
            # triggering row (i - 1) remains outside the tol range.
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
        invest(df, i, amount)
    return df

df = setup_df()
tol = 0.05
invest(df, i=0, amount=100)
rebalance(df, tol)

df['portfolio value'] = df['ibm value'] + df['ford value']
df['ibm weight'] = df['ibm value'] / df['portfolio value']
df['ford weight'] = df['ford value'] / df['portfolio value']

print(df['ibm weight'].min())
print(df['ibm weight'].max())
print(df['ford weight'].min())
print(df['ford weight'].max())

# This shows the rows which trigger rebalancing
mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
print(df.loc[mask])
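Since the Yahoo download above depends on an old pandas, the same idea can be tried on synthetic prices. The following is a compact sketch, not the answer's code: it uses a plain loop instead of repeated slicing, and the function name `simulate_band_rebalance`, its column names, and the toy prices are assumptions made for illustration. Like the answer, it rebalances on the row after the band is broken.

```python
import numpy as np
import pandas as pd


def simulate_band_rebalance(price_a, price_b, tol=0.05, amount=100.0):
    """Hold two stocks 50/50 and rebalance the day after the value ratio
    leaves the band [1 - tol, 1 + tol]."""
    price_a = np.asarray(price_a, dtype=float)
    price_b = np.asarray(price_b, dtype=float)
    n = len(price_a)
    value_a = np.empty(n)
    value_b = np.empty(n)
    rebalanced = np.zeros(n, dtype=bool)
    shares_a = amount / 2 / price_a[0]
    shares_b = amount / 2 / price_b[0]
    pending = False  # True if the previous row broke the band
    for t in range(n):
        if pending:
            # Split the current portfolio value evenly again.
            total = shares_a * price_a[t] + shares_b * price_b[t]
            shares_a = total / 2 / price_a[t]
            shares_b = total / 2 / price_b[t]
            rebalanced[t] = True
        value_a[t] = shares_a * price_a[t]
        value_b[t] = shares_b * price_b[t]
        ratio = value_a[t] / value_b[t]
        pending = ratio >= 1 + tol or ratio <= 1 - tol
    return pd.DataFrame({"value_a": value_a, "value_b": value_b,
                         "rebalanced": rebalanced})


result = simulate_band_rebalance([10, 11, 11, 11], [10, 10, 10, 10])
print(result)
```

In this toy run the band is broken on row 1 (55 against 50), and on row 2 the total of 105 is split into 52.5 per stock.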

Answer 2

You can use this code to calculate your portfolio at each point in time.

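i = df.index[0]
# .ix has been removed from pandas; label-based .loc does the same job here
df['ibm_prop'] = 0.5/df.ibm.loc[i]
df['ford_prop'] = 0.5/df.ford.loc[i]

while i:
   try:
      # first date on which the portfolio value deviates from 1 by more than tol
      i =  df[abs(1-(df.ibm_prop*df.ibm + df.ford_prop*df.ford)) > tol].index[0]
   except IndexError:
      break
   # reset the proportions from date i onwards
   df.loc[i:, 'ibm_prop'] = 0.5/df.ibm.loc[i]
   df.loc[i:, 'ford_prop'] = 0.5/df.ford.loc[i]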
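One possible reason for the weights observed in the question's follow-up comment is that the test above looks at the total portfolio value, not at the gap between the two positions. The sketch below compares the two criteria on made-up numbers; both function names are hypothetical and the example is only meant to show that the tests can disagree.

```python
def value_criterion(a0, b0, a, b, tol):
    """Criterion as used in the loop above: total value drifts away from 1."""
    portfolio = 0.5 / a0 * a + 0.5 / b0 * b
    return abs(1 - portfolio) > tol


def ratio_criterion(a0, b0, a, b, tol):
    """Criterion from the question: one position outgrows the other."""
    ratio = (0.5 / a0 * a) / (0.5 / b0 * b)
    return ratio >= 1 + tol or ratio <= 1 - tol


tol = 0.05
both_up = (10.0, 10.0, 11.0, 11.0)   # both stocks gain 10%
diverge = (10.0, 10.0, 10.4, 9.6)    # one gains 4%, the other loses 4%

print(value_criterion(*both_up, tol), ratio_criterion(*both_up, tol))
print(value_criterion(*diverge, tol), ratio_criterion(*diverge, tol))
```

When both stocks rise together, the value test fires although the weights are still 50/50; when they diverge symmetrically, the weights have left the band but the value test stays silent.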

Answer 3

Just a mathematical improvement on maxymoo's answer:

i = df.index[0]
# .ix has been removed from pandas; label-based .loc does the same job here
df['ibm_prop'] = df.ibm.loc[i]/(df.ibm.loc[i]+df.ford.loc[i])
df['ford_prop'] = df.ford.loc[i]/(df.ibm.loc[i]+df.ford.loc[i])

while i:
   try:
      i =  df[abs((df.ibm_prop*df.ibm - df.ford_prop*df.ford)) > tol].index[0]
   except IndexError:
      break
   df.loc[i:, 'ibm_prop'] = df.ibm.loc[i]/(df.ibm.loc[i]+df.ford.loc[i])
   df.loc[i:, 'ford_prop'] = df.ford.loc[i]/(df.ibm.loc[i]+df.ford.loc[i])

Answer 4

What about this:

df["d"]= [0,0,0,0,0,0,0,0,0,0]
df["t"]= np.arange(len(df))
tol = 0.05

def flex_relative(x):
    if df.ibm/df.ibm.iloc[df.d].values < df.ford/df.ford.iloc[df.d].values * (1+tol):
        return  df.iloc[df.index.get_loc(x.name) - 1]['d'] == df.t
    elif df.ibm/df.ibm.iloc[df.d].values > df.ford/df.ford.iloc[df.d].values * (1+tol):
        return df.iloc[df.index.get_loc(x.name) - 1]['d'] == df.t

4 answers

Answer 1
8 (accepted) 2015-06-12 13:24:32

Answer 2
3 2015-06-10 02:33:22

Answer 3
2 2015-06-12 06:17:18

Answer 4
2 2015-06-12 06:56:04
