
Struggling to save objects to Django database through Celery Beat on Digital Ocean

I'm struggling to save objects through Celery Beat to my Django app (showing OHLC data).

This script works fine in my local environment (it saves 3M objects), but not on a VPS like Digital Ocean. It saves a certain number of objects (roughly 200K objects, or 2GB), but then it removes other objects to add each new object, which is totally confusing.

My stack

  • Django
  • Redis
  • Supervisor
  • Ubuntu

I'm NOT using Supervisor locally, so I suspect it is what's causing the issue, but I can't confirm it. Any feedback / help would be really appreciated.
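(For reference, a Supervisor program entry for running Celery Beat on a droplet typically looks roughly like the sketch below; the paths, user, and project name are placeholders, not the actual values from my setup.)

; /etc/supervisor/conf.d/celery.conf  (placeholder paths and names)
[program:celery_worker]
command=/home/django/venv/bin/celery -A myproject worker --loglevel=info
directory=/home/django/myproject
user=django
autostart=true
autorestart=true
stdout_logfile=/var/log/celery/worker.log
stderr_logfile=/var/log/celery/worker.err.log

[program:celery_beat]
command=/home/django/venv/bin/celery -A myproject beat --loglevel=info
directory=/home/django/myproject
user=django
autostart=true
autorestart=true
stdout_logfile=/var/log/celery/beat.log
stderr_logfile=/var/log/celery/beat.err.log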

Script

@periodic_task(
    # run_every=(crontab(minute='*/1')),
    run_every=(crontab(minute='*/60')),
    name="load_data",
    ignore_result=False
)
def load_data():
    # Forex OHLC
    TOKEN = MYTOKEN
    con = fxcmpy.fxcmpy(access_token=TOKEN, log_level='error')
    start = dt.datetime(2010, 1, 1)
    stop = dt.datetime.today()
    df = pd.DataFrame(list(DatasourceItem.objects.filter(datasource__sub_category__exact='Forex').values('symbol')))

    for i in df['symbol']:

        datasource_item_obj = DatasourceItem.objects.get(symbol=i)

        # Remove previously stored prices for this datasource before reloading
        Ohlc.objects.filter(datasource=datasource_item_obj).delete()

        if datasource_item_obj.base_symbol:
            base_symbol = datasource_item_obj.base_symbol
            tar_symbol = datasource_item_obj.tar_symbol
            mod_symbol = base_symbol + "/" + tar_symbol
            sys_symbol = base_symbol + tar_symbol
        else:
            sys_symbol = datasource_item_obj.symbol
            mod_symbol = datasource_item_obj.symbol

        data = con.get_candles(mod_symbol, period='D1', start=start, stop=stop)
        del data['askopen']
        del data['askclose']
        del data['askhigh']
        del data['asklow']
        del data['tickqty']
        data.columns = ['Open', 'Close', 'High', 'Low']
        data = data[['Open', 'High', 'Low', 'Close']]
        data.insert(loc=0, column='Symbol', value=sys_symbol)
        data.reset_index(level=0, inplace=True)
        data.dropna()
        # .values = return numpy array
        data_list = data.values.tolist()
        for row in data_list:
            new_price = Ohlc(time=row[0], symbol=row[1], open_price=row[2], high_price=row[3], low_price=row[4], close_price=row[5], datasource=datasource_item_obj)
            new_price.save()

    # Stock OHLC
    start = dt.datetime.now() - dt.timedelta(days=(365.25 * 5))
    stop = dt.datetime.today()

    df = pd.DataFrame(list(DatasourceItem.objects.filter(datasource__sub_category__exact='Stock').values('symbol')))
    for i in df['symbol']:
        datasource_obj = DatasourceItem.objects.get(symbol=i)
        old_price = Ohlc.objects.filter(datasource=datasource_obj).delete()

        symbol = datasource_obj.symbol
        data = get_historical_data(symbol, start=start, stop=stop, output_format='pandas')
        del data['volume']
        data.columns = ['Open', 'High', 'Low', 'Close']
        data.insert(loc=0, column='Symbol', value=symbol)
        data.reset_index(level=0, inplace=True)
        data.dropna()
        data_list = data.values.tolist()
        for row in data_list:
            price = Ohlc(time=row[0], symbol=row[1], open_price=row[2], high_price=row[3], low_price=row[4], close_price=row[5], datasource=datasource_obj)
            price.save()

Hey, it's happening because of the number of transactions hitting the database, so try to optimize the data creation query. For example, you can use bulk create instead of creating each object individually.

price_list = []
for row in data_list:
    price = Ohlc(time=row[0], symbol=row[1], open_price=row[2], high_price=row[3], low_price=row[4], close_price=row[5], datasource=datasource_obj)
    price_list.append(price)
# one INSERT for the whole list instead of one query per object
Ohlc.objects.bulk_create(price_list)

It's possible that it won't be able to create such a large set of data in one go; in that case, break the data into chunks of 1000.
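A rough sketch of that, reusing the same data_list and datasource_obj as in the loop above; Django's bulk_create accepts a batch_size argument, so you don't have to split the list yourself:

price_list = [
    Ohlc(
        time=row[0],
        symbol=row[1],
        open_price=row[2],
        high_price=row[3],
        low_price=row[4],
        close_price=row[5],
        datasource=datasource_obj,
    )
    for row in data_list
]
# batch_size=1000 makes Django issue the INSERTs in chunks of 1000 rows
# instead of a single huge statement.
Ohlc.objects.bulk_create(price_list, batch_size=1000)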
