
Iterating through DataFrame using pandas for stocks

I am relatively new to Python and pandas. I have a DataFrame with a few stocks and their associated 'low' prices for the past few days. I am trying to iterate through each stock (right now I only have 3, but I will eventually have thousands). For each stock, I want to check whether the current day's low price is greater than yesterday's low price AND whether yesterday's low price is less than the low price from 2 days ago. I eventually want to export every stock that meets these criteria to a CSV file.

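from datetime import datetime
from pandas_datareader.data import DataReader

list = ['IBM', 'AMZN', 'FB']

stockData = DataReader(list, 'yahoo', datetime(2016, 6, 8), datetime.utcnow())

low = stockData['Low']

low0 = low.iloc[-1]  # most recent day
low1 = low.iloc[-2]  # one day earlier
low2 = low.iloc[-3]  # two days earlier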

The variables low0, low1, and low2 are probably not necessary, but I like how they slice out the specific data I want.

I then tried iterating over each stock in my list with my function:

for stock in list:
    if low0 > low1 and low1 < low2:
        print(True)
    else:
        print(False)

This is the error I get: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I would appreciate any input.

To identify whether the Low has been increasing for the last three days, you can use the following:

stockData = stockData.sort_index(ascending=False).iloc[:3] # reverse order, use last three days

Then either use a condition that compares Low between adjacent days and selects a row only if Low has increased in both cases:

stockData[(stockData.Low < stockData.Low.shift(1)) & (stockData.Low.shift(1) < stockData.Low.shift(2))]

Or check whether every day-to-day difference among the last three Low prices is negative (negative because the most recent day now comes first):

(stockData.Low.diff().dropna() < 0).all()
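To see that the two checks agree without downloading anything, here is a minimal sketch on made-up prices. The frame `demo` and its values are invented for illustration; only the `Low` column name and the sort/slice step come from the snippets above.

```python
import pandas as pd

# Hypothetical prices, oldest first, standing in for the downloaded data.
demo = pd.DataFrame(
    {"Low": [150.28, 150.40, 150.60]},
    index=pd.to_datetime(["2016-06-13", "2016-06-14", "2016-06-15"]),
)

# Most recent day first, keep three rows (same step as above).
recent = demo.sort_index(ascending=False).iloc[:3]

# Check 1: shift-based mask; only the oldest row can satisfy both comparisons.
mask = (recent.Low < recent.Low.shift(1)) & (recent.Low.shift(1) < recent.Low.shift(2))
rising_by_shift = bool(mask.any())

# Check 2: every day-to-day difference is negative in descending date order.
rising_by_diff = bool((recent.Low.diff().dropna() < 0).all())

print(rising_by_shift, rising_by_diff)
```

Both flags come out True for these values, since the low rises on each of the three days.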

So for your application:

from datetime import datetime
from pandas_datareader.data import DataReader

for stock in ['IBM', 'AMZN', 'FB']:
    # Download one ticker, newest day first, and keep the last three days.
    stockData = DataReader(stock, 'yahoo', datetime(2016, 6, 8), datetime.utcnow()).sort_index(ascending=False).iloc[:3]
    print('\n', stockData.Low)
    print(stock, (stockData.Low.diff().dropna() < 0).all())
    print(stock, stockData[(stockData.Low < stockData.Low.shift(1)) & (stockData.Low.shift(1) < stockData.Low.shift(2))].Low.any())


 Date
2016-06-15    150.600006
2016-06-14    150.399994
2016-06-13    150.279999
Name: Low, dtype: float64
IBM True
IBM True

 Date
2016-06-15    713.349976
2016-06-14    712.270020
2016-06-13    711.159973
Name: Low, dtype: float64
AMZN True
AMZN True

 Date
2016-06-15    114.070000
2016-06-14    113.580002
2016-06-13    113.309998
Name: Low, dtype: float64
FB True
FB True
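As for the ValueError in the question: `low0 > low1` is a Series of booleans (one per stock), and `if ... and ...` cannot reduce a whole Series to a single True/False. Combining the comparisons element-wise with `&` avoids both the error and the loop. The following is only a sketch, assuming a `low` frame with one column per ticker as in the question; the prices and the file name `signals.csv` are made up.

```python
import pandas as pd

# Hypothetical 'Low' prices, one column per ticker, oldest row first.
low = pd.DataFrame(
    {
        "IBM": [150.3, 150.1, 150.6],
        "AMZN": [711.2, 712.3, 713.3],
        "FB": [113.6, 113.3, 114.1],
    },
    index=pd.to_datetime(["2016-06-13", "2016-06-14", "2016-06-15"]),
)

low0, low1, low2 = low.iloc[-1], low.iloc[-2], low.iloc[-3]

# Element-wise test per ticker: today's low is above yesterday's
# AND yesterday's low is below the low from two days ago.
signal = (low0 > low1) & (low1 < low2)

# Keep only the tickers that meet both conditions.
matches = signal[signal].index.tolist()
print(matches)

# Export the matching tickers with their latest low (file name is arbitrary).
low0[matches].rename("Low").to_csv("signals.csv", index_label="Ticker", header=True)
```

With these invented numbers IBM and FB qualify, while AMZN does not because its low rose on both days. The same expression works unchanged for thousands of columns.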

This is an example of a similar, but slightly different approach to this problem. I am using dummy values to demonstrate.

First, I create a dataframe.

import pandas as pd

dates = pd.date_range('20130101', periods=3)
IBM = [5, 3, 2]
AMZN = [1, 7, 6]
FB = [4, 7, 9]
df = pd.DataFrame({'IBM': IBM, 'AMZN': AMZN, 'FB': FB}, index=dates)
df
            AMZN  FB  IBM
2013-01-01     1   4    5
2013-01-02     7   7    3
2013-01-03     6   9    2

I use .shift() to track how much each value went up or down compared with the previous day. I do this by subtracting df.shift(1) from df. The first day has no previous day, so its values become NaN.

df - df.shift(1)
            AMZN   FB  IBM
2013-01-01   NaN  NaN  NaN
2013-01-02   6.0  3.0 -2.0
2013-01-03  -1.0  2.0 -1.0

If you prefer True or False, you can check whether the differences are greater than 0. In this case, True means the value went up and False means it did not (it fell or stayed the same). The first day, the starting value, also shows False, because comparing NaN with 0 evaluates to False.

(df - df.shift(1)) > 0
             AMZN     FB    IBM
2013-01-01  False  False  False
2013-01-02   True   True  False
2013-01-03  False   True  False
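The boolean frame can then be reduced to one answer per stock. Here is a sketch using the same dummy values: `df.diff()` gives the same result as `df - df.shift(1)`, and dropping the first (all-NaN) row before calling `.all()` keeps the starting day from counting as a fall. The names `change` and `rose_every_day` are just for illustration.

```python
import pandas as pd

dates = pd.date_range("20130101", periods=3)
df = pd.DataFrame({"IBM": [5, 3, 2], "AMZN": [1, 7, 6], "FB": [4, 7, 9]}, index=dates)

# Day-over-day change; same result as df - df.shift(1).
change = df.diff()

# Drop the first row (all NaN), then require a rise on every remaining day.
rose_every_day = (change.dropna() > 0).all()

# List the stocks that rose on both days.
print(rose_every_day[rose_every_day].index.tolist())
```

For these values only FB rose on both days, so it is the only name in the list.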
