
How do I iterate an input list over this function and consequently

I want to iterate a list of input values over the following function and then write the output to a CSV file. I have a big CSV file which I first import to create a dataframe. I then want to extract a certain part of the dataframe, namely the rows from 10 days before to 10 days after a certain date, for a certain stock.

The dataframe looks as follows (this is just a small part; in reality it's 100k+ rows covering every day and all stock tickers for the period 2011 - 2019):

Date           Symbol   ShortExemptVolume   ShortVolume     TotalVolume
2011-01-03     AAWW     0.0                     28369           78113.0
2011-01-03     AMD      0.0                     3183556         8095093.0
2011-01-03     AMRS     0.0                     14196           18811.0
2011-01-03     ARAY     0.0                     31685           77976.0
2011-01-03     ARCC     0.0                     177208          423768.0

The function is as follows. It filters the dataframe by stock ticker and then by date (from 10 days before to 10 days after a given date).

import pandas as pd
from datetime import datetime
import urllib
import datetime

def get_data(issue_date, stock_ticker):
    df = pd.read_csv (r'C:\Users\name\document.csv')
    df['Date'] = pd.to_datetime(df['Date'], format="%Y%m%d")

    x = -10 #set window range
    y = -(x)
    date_1 = datetime.datetime.strptime(issue_date, "%Y-%m-%d")
    before_date = pricing_date = date_1 + datetime.timedelta(days=x)       # days of issue date
    after_date = date_1 + datetime.timedelta(days=y)

    cond1 = df['Date'] >= before_date
    cond2 = df['Date'] <= after_date
    cond3 = df['Symbol'] == 'stock_ticker'

    short_data = df[cond1 & cond2 & cond3]

    return [short_data]

I have a list with a couple hundred rows that contain a specific stock ticker and issue date, for example like this:

ARAY    4/24/2014
ACET    11/16/2015
ACET    11/16/2015
AEGR    8/15/2014
ATSG    9/29/2017

I would like to call the function for every stock and its respective date in the list and get the combined output in CSV format. The output should be 20 dates for every row in the input file.

Any tips or help is welcome

Consider building a list of data frames by calling the function once per row of your input list, then combining them with pd.concat. Also, there is no need to call to_datetime separately, since read_csv accepts a parse_dates argument:

import pandas as pd
from datetime import datetime as dt, timedelta

def get_data(issue_date, stock_ticker):
    df = pd.read_csv(r'C:\Users\name\document.csv', parse_dates=['Date'])

    x = -10 #set window range
    y = -(x)
    date_1 = dt.strptime(issue_date, "%m/%d/%Y")        # MATCH ACCORDING TO INPUT
    before_date = date_1 + timedelta(days=x)
    after_date = date_1 + timedelta(days=y)

    cond1 = df['Date'] >= before_date
    cond2 = df['Date'] <= after_date
    cond3 = df['Symbol'] == stock_ticker                # REMOVE SINGLE QUOTES

    short_data = df[cond1 & cond2 & cond3]

    return short_data                                   # REMOVE LIST BRACKETS

stock_date_list = [['ARAY', '4/24/2014'],
                   ['ACET', '11/16/2015'],
                   ['ACET', '11/16/2015'],
                   ['AEGR', '8/15/2014'],
                   ['ATSG', '9/29/2017']]

# LIST COMPREHENSION ITERATIVELY CALLING FUNCTION
df_list = [get_data(i[1], i[0]) for i in stock_date_list]

# SINGLE DATA FRAME COMPILATION
final_df = pd.concat(df_list, ignore_index=True)
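Since the question also asks for CSV output and the input list has a couple hundred rows, here is a minimal sketch of one possible variation: read the file once, pass the data frame into the filter function, and write the combined result with to_csv. The names get_window, SAMPLE_CSV and csv_text are hypothetical, and the sample rows are made-up values standing in for the real file.

```python
import io
from datetime import datetime, timedelta

import pandas as pd

# Tiny in-memory stand-in for the large CSV (values are made up for illustration).
SAMPLE_CSV = """Date,Symbol,ShortExemptVolume,ShortVolume,TotalVolume
2014-04-10,ARAY,0.0,100,300.0
2014-04-22,ARAY,0.0,110,310.0
2014-04-24,ARAY,0.0,120,320.0
2014-05-02,ARAY,0.0,130,330.0
2014-05-10,ARAY,0.0,140,340.0
2014-04-24,AMD,0.0,900,2000.0
"""

def get_window(df, issue_date, stock_ticker, days=10):
    """Return rows for one ticker within +/- `days` calendar days of issue_date."""
    center = datetime.strptime(issue_date, "%m/%d/%Y")
    start = center - timedelta(days=days)
    end = center + timedelta(days=days)
    # between() is inclusive on both ends by default
    mask = (df["Symbol"] == stock_ticker) & df["Date"].between(start, end)
    return df[mask]

# Read the data once, outside the loop, instead of once per function call.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Date"])

stock_date_list = [["ARAY", "4/24/2014"], ["AMD", "4/24/2014"]]

# One filtered data frame per (ticker, date) pair, then a single combined frame.
frames = [get_window(df, date, ticker) for ticker, date in stock_date_list]
final_df = pd.concat(frames, ignore_index=True)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = final_df.to_csv(index=False)
```

To write a file instead, pass a path, for example final_df.to_csv('output.csv', index=False), where the file name is only a placeholder. Note that timedelta(days=10) counts calendar days, so on data with one row per trading day each window will usually contain fewer than 20 rows.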


 