DataFrame: append a new Series column with data

I have a pandas DataFrame and I want to add another column to it, but I can't get it to work with append, add, or insert.

I'm trying to replicate the portfolio data using pandas' built-in functions, because this script doesn't give me correct data when the period I request is shorter than about 1.5 years, whereas I need to be able to fetch data for as little as two days. Here's the script that I want to rewrite:

import QSTK.qstkutil.qsdateutil as du
import QSTK.qstkutil.tsutil as tsu
import QSTK.qstkutil.DataAccess as da

import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd

ls_symbols = ["AAPL", "GLD", "GOOG", "$SPX", "XOM"]
dt_start = dt.datetime(2006, 1, 1)
dt_end = dt.datetime(2010, 12, 31)
dt_timeofday = dt.timedelta(hours=16)
ldt_timestamps = du.getNYSEdays(dt_start, dt_end, dt_timeofday)

c_dataobj = da.DataAccess('Yahoo')
ls_keys = ['open', 'high', 'low', 'close', 'volume', 'actual_close']
ldf_data = c_dataobj.get_data(ldt_timestamps, ls_symbols, ls_keys)
d_data = dict(zip(ls_keys, ldf_data))
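
To make the target structure concrete, here is a minimal sketch of what the final `dict(zip(...))` line in the script above produces. It assumes `get_data` returns one DataFrame per key, indexed by date with one column per symbol; the frames below are synthetic stand-ins, not real QSTK output.

```python
import pandas as pd

# Synthetic stand-ins for the list get_data is assumed to return:
# one DataFrame per key, indexed by date, one column per symbol.
dates = pd.date_range("2010-01-04", periods=3, freq="D")
keys = ["open", "close"]

frames = [
    pd.DataFrame({"AAPL": [1.0, 2.0, 3.0], "GLD": [10.0, 11.0, 12.0]}, index=dates),
    pd.DataFrame({"AAPL": [1.5, 2.5, 3.5], "GLD": [10.5, 11.5, 12.5]}, index=dates),
]

# Pair each key with its DataFrame, as the script does.
d_data = dict(zip(keys, frames))

print(sorted(d_data))                 # the price fields
print(list(d_data["close"].columns))  # the symbols
print(d_data["close"]["GLD"])         # one symbol's close series
```

So the goal is a dict with one entry per price field, where each value is a DataFrame holding every symbol as a column.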

The line d_data = dict(zip(ls_keys, ldf_data)) is what I want to replicate, because the script doesn't fetch the data that I want. To do that, I need to figure out a way to append a new column to the DataFrame stored in my dict. Here is my script:

from pandas.io.data import DataReader, DataFrame
import QSTK.qstkutil.qsdateutil as du
import QSTK.qstkutil.DataAccess as da

import datetime as dt

def get_historical_data(symbol, source, date_from, date_to):
    global data_validator
    symbol_data = {}

    ls_keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

    for key in ls_keys:
        symbol_data[key] = DataFrame({})

    dataframe_open = DataFrame({})

    for item in symbol:
        print 'Fetching data for:', item
        current_data = DataReader(str(item), source, date_from, date_to)
        dataframe_open = {item : current_data['Open']}
        if len(symbol_data['Open'].columns) == 0:
            symbol_data['Open'] = DataFrame(dataframe_open)
        else:
            # I want to add the new column here but can't seem to do this.
            #symbol_data['Open'].loc[:item] = DataFrame(dataframe_open)
            pass
    return symbol_data

PS: For testing purposes, I call the function with these parameters:

test = get_historical_data(['SPY', 'DIA'], 'yahoo', dt.datetime(2015, 1, 1), dt.datetime(2015, 1, 31))

Does the following help? I have not tested it yet, but it should work in principle. Just put the data in arrays of equal length and construct the DataFrame from them.

# Assumes DataReader and DataFrame are imported as in the question.
def get_historical_data(symbols, source, date_from, date_to):
    ls_keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

    # Fetch each symbol once. DataReader returns one DataFrame per symbol
    # whose columns are the keys listed above.
    fetched = {}
    for item in symbols:
        print('Fetching data for: %s' % item)
        fetched[item] = DataReader(str(item), source, date_from, date_to)

    # Build one DataFrame per key with one column per symbol. Constructing
    # from a dict of Series aligns the columns on their date index.
    symbol_data = {}
    for key in ls_keys:
        columns = dict((item, fetched[item][key]) for item in symbols)
        symbol_data[key] = DataFrame(columns)
    return symbol_data
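
For the step the question is stuck on, adding a column to a DataFrame that already exists, plain column assignment is usually enough: `df[name] = series` aligns the new Series on the existing index. The sketch below uses synthetic Series in place of `DataReader` output (an assumption, so that it runs without network access), and `add_symbol_column` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def add_symbol_column(frame, name, series):
    """Return a copy of frame with series added as column `name`.

    Hypothetical helper for illustration; not part of pandas.
    """
    if len(frame.columns) == 0:
        # An empty frame has no index yet, so build it from the Series.
        return pd.DataFrame({name: series})
    out = frame.copy()
    out[name] = series  # assignment aligns on the existing index
    return out

# Synthetic stand-ins for current_data['Open'] of two symbols.
dates = pd.date_range("2015-01-02", periods=3, freq="D")
spy_open = pd.Series([206.4, 204.2, 202.1], index=dates)
dia_open = pd.Series([178.3, 176.0], index=dates[:2])  # one day shorter

opens = pd.DataFrame()
opens = add_symbol_column(opens, "SPY", spy_open)
opens = add_symbol_column(opens, "DIA", dia_open)
print(opens)

# Alternative: build the whole frame in one step.
# concat with axis=1 keeps the union of all dates.
opens_concat = pd.concat({"SPY": spy_open, "DIA": dia_open}, axis=1)
print(opens_concat)
```

In both variants the Series do not need to be padded to equal length by hand. Dates missing from one symbol show up as NaN in that column.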
