MySQL cumulative product group by

Question

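I have been working with the WRDS/CRSP dataset (a stock price database maintained by UPenn for academic research). I have been downloading the data in Python and inserting it into my local MySQL database.

The data looks like this, with primary key (quote_date, security_id):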

quote_date  security_id tr              accum_index
10-Jan-86   10002       null            1000
13-Jan-86   10002       -0.026595745    973.4042548
14-Jan-86   10002       0.005464481     978.7234036
15-Jan-86   10002       -0.016304348    962.7659569
16-Jan-86   10002       0               962.7659569
17-Jan-86   10002       0               962.7659569
20-Jan-86   10002       0               962.7659569
21-Jan-86   10002       0.005524862     968.0851061
22-Jan-86   10002       -0.005494506    962.765957
23-Jan-86   10002       0               962.765957
24-Jan-86   10002       -0.005524862    957.4468078
27-Jan-86   10002       0.005555556     962.7659569
28-Jan-86   10002       0               962.7659569
29-Jan-86   10002       0               962.7659569
30-Jan-86   10002       0               962.7659569
31-Jan-86   10002       0.027624309     989.3617013
3-Feb-86    10002       0.016129032     1005.319148
4-Feb-86    10002       0.042328041     1047.872338
5-Feb-86    10002       0.04568528      1095.744679

I need to calculate the accum_index column, which is basically an index of the stock's total return, computed as follows:

accum_index_t = accum_index_{t-1} * (1 + tr_t)
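
As a quick check of the recurrence, here is a minimal Python sketch (Python being the language already used for the download step) that rebuilds the first few accum_index values from the tr column of the sample above. The function name accumulate_index and its base argument are illustrative, not part of the original post. Treating the null tr on the first row as a zero return is an assumption, chosen because it reproduces the sample's starting value of 1000.

```python
from itertools import accumulate

def accumulate_index(returns, base=1000.0):
    """Apply accum_index_t = accum_index_{t-1} * (1 + tr_t), starting from base.

    A None return (the null tr on a security's first row) is treated as 0.
    This is an assumption that matches the sample's starting value of 1000.
    """
    factors = [1.0 + (tr if tr is not None else 0.0) for tr in returns]
    # accumulate() emits the initial value first, so drop it with [1:].
    return list(accumulate(factors, lambda level, f: level * f, initial=base))[1:]

# First four tr values for security 10002 from the sample data.
sample_tr = [None, -0.026595745, 0.005464481, -0.016304348]
index_levels = accumulate_index(sample_tr)
print(index_levels)
```

Up to rounding, the printed values should agree with the first four accum_index entries in the sample (1000, 973.40, 978.72, 962.77).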

The table has 80 million rows. I have written some code that iterates over each security_id and calculates the cumulative product, like so:

select @sid := min(security_id)
from stock_prices;

create temporary table prices (
    quote_date datetime,
    security_id int,
    tr double null,
    accum_index double null,
    PRIMARY KEY (quote_date, security_id)
);

while @sid is not null
do

    select 'security_id', @sid;
    select @accum := null;

    insert into prices
    select quote_date, security_id, tr, accum_index
    from stock_prices
    where security_id = @sid
    order by quote_date asc;

    update prices
    set accum_index = (@accum := ifnull(@accum * (1 + tr), 1000.0));

    update stock_prices p use index(PRIMARY), prices a use index(PRIMARY)
    set p.accum_index = a.accum_index
    where p.security_id = a.security_id
    and p.quote_date = a.quote_date;

    select @sid := min(security_id)
    from stock_prices
    where security_id > @sid;

    delete from prices;

end while;

drop table prices;

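But this is far too slow: it takes about a minute per security on my laptop, and at that rate computing the whole series would take years. Is there a way to vectorize this?

Cheers, Steve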

Answer 1

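If you are using MySQL 8, you can use window functions to create a cumulative product. Unfortunately, there is no PROD() aggregate/window function in any SQL database I know of, but you can emulate it using EXP(SUM(LOG(factor))):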

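SELECT
  quote_date,
  security_id,
  tr,
  -- A null tr (first row of each security) counts as a zero return, so the index starts at 1000.
  -- LOG requires 1 + tr > 0, so a tr of -1 or below would yield NULL here.
  1000 * EXP(SUM(LOG(1 + COALESCE(tr, 0)))
    OVER (PARTITION BY security_id ORDER BY quote_date))
    AS accum_index
FROM stock_prices;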

A dbfiddle demo is here.
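
To see why EXP(SUM(LOG(factor))) stands in for a product, here is a small Python sketch comparing the two on a few growth factors from the question's sample. The helper name product_via_logs is made up for illustration, and the null tr of the first row is again assumed to count as a zero return.

```python
import math

def product_via_logs(factors):
    """Product of strictly positive factors, computed as exp(sum(log(f)))."""
    return math.exp(sum(math.log(f) for f in factors))

# Growth factors (1 + tr) for the first four sample rows; the null tr counts as 0.
factors = [1.0, 1 - 0.026595745, 1 + 0.005464481, 1 - 0.016304348]

direct = math.prod(factors)            # plain product, for comparison
via_logs = product_via_logs(factors)   # the EXP(SUM(LOG(...))) route
print(1000 * direct, 1000 * via_logs)
```

The two results should agree up to floating-point rounding. The identity only holds for strictly positive factors, that is, for tr greater than -1: math.log raises ValueError on a zero or negative factor, and MySQL's LOG returns NULL in the same situation.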

Answer 2

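If you are using MySQL 5, you can emulate this function by multiplying the current tr factor by the running product of the previous rows, row by row. Afterwards we take the accumulated value of the last row.

tr is a percentage change, right? So let's add 1 to each tr to turn it into a growth factor.

The first stored value will be the neutral value 1.

Try this: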

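SET @variation = 1;
SET @row_number = 0;

-- Assumes prices holds a single security and is read in quote_date order.
SELECT accumulateTr
FROM
    (SELECT
        @row_number := (@row_number + 1) AS rowNumber,
        -- tr is the return column; a null tr counts as a zero return
        @variation := (1 + COALESCE(tr, 0)) * @variation AS accumulateTr
     FROM
        prices) accumulatedTrs
ORDER BY rowNumber DESC
LIMIT 1;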
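
The query above accumulates over a single series. As a rough illustration of how the same running-variable idea extends to many securities, here is a Python sketch that walks rows sorted by (security_id, quote_date) and restarts the accumulator whenever security_id changes. The function name running_index and the row tuples are hypothetical, and security 10003 is invented for the example. It illustrates the logic only and is not a MySQL 5 query.

```python
def running_index(rows, base=1000.0):
    """Yield (security_id, quote_date, accum_index) for rows sorted by
    (security_id, quote_date), restarting at base for each new security."""
    current_id = None
    level = base
    for security_id, quote_date, tr in rows:
        if security_id != current_id:
            # New security: restart the index, like resetting @variation.
            current_id = security_id
            level = base
        # A None tr (null in the table) is assumed to mean a zero return.
        level *= 1.0 + (tr if tr is not None else 0.0)
        yield security_id, quote_date, level

# Hypothetical rows: 10002 follows the question's sample, 10003 is made up.
rows = [
    (10002, "1986-01-10", None),
    (10002, "1986-01-13", -0.026595745),
    (10003, "1986-01-10", None),
    (10003, "1986-01-13", 0.01),
]
result = list(running_index(rows))
for row in result:
    print(row)
```

Each security starts again from the base value of 1000, which is the per-group behaviour the question asks for.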

Answer 1
5 2018-09-17 18:27:43

Answer 2
0 2021-05-21 14:22:55
