Fetching a single row of stock data for every company and adding it to a DataFrame row by row

Question

I'm trying to iterate over a list of companies, fetch a single row of stock data for each one, and add the results row by row to one dataframe.

The code is below:

df = pdr.get_data_yahoo('ABB.NS', start = "2021-6-2", end = "2021-6-3")
df

The output is:

              Open    High     Low    Close  Adj Close  Volume
Date
2021-06-02  1698.0  1717.0  1668.0  1700.55    1700.55  314707

Similarly, I have a list of company names and want to fetch a single row for each company and add them row by row.

The list is:

sym = symbol[:5]
sym

Output is:

['20MICRONS.NS', '21STCENMGM.NS', '3IINFOTECH.NS', '3MINDIA.NS', '3PLAND.NS']

The code I'm trying is:

for i in sym:
    df = pdr.get_data_yahoo(i, start = "2021-6-2", end = "2021-6-3")

Output is:

             Open   High    Low  Close  Adj Close  Volume
Date
2021-06-02  14.05  14.05  13.25   13.5       13.5    3861

Expected output is:

             Open   High    Low  Close  Adj Close  Volume
Date
2021-06-02  14.05  14.05  13.25   13.5       13.5    3861
"           Other  Other  Other  Other      Other   Other
"           Other  Other  Other  Other      Other   Other
"           Other  Other  Other  Other      Other   Other
"           Other  Other  Other  Other      Other   Other

"Other" stands for the stock values of the other companies.

The output is only a single row. I expect 5 rows because I'm iterating over 5 company names.

If a company doesn't have data for the given date, an error like this is raised:

Exception in thread Thread-96:
Traceback (most recent call last):
  File "c:\python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "c:\python37\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\venka\all\lib\site-packages\multitasking\__init__.py", line 102, in _run_via_pool
    return callee(*args, **kwargs)
  File "C:\Users\venka\all\lib\site-packages\fix_yahoo_finance\__init__.py", line 322, in _download_one_threaded
    period, interval, prepost)
  File "C:\Users\venka\all\lib\site-packages\fix_yahoo_finance\__init__.py", line 333, in _download_one
    actions=actions, auto_adjust=auto_adjust)
  File "C:\Users\venka\all\lib\site-packages\fix_yahoo_finance\__init__.py", line 246, in history
    raise ValueError(self.ticker, err_msg)
ValueError: ('ANMOL.NS', 'No data found, symbol may be delisted')

Data for the symbol ANMOL.NS is not present for that date. How can I put null values in its place?

Answer 1

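In each iteration of your for-loop, you overwrite the previous value of 'df', so only the last company's data survives. One way to resolve this is to collect the frames in a list and concatenate them once at the end: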

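import pandas as pd

df_list = []
for i in sym:
    # collect each company's frame instead of overwriting df
    df_list.append(pdr.get_data_yahoo(i, start="2021-6-2", end="2021-6-3"))
# stack all collected frames vertically
df = pd.concat(df_list, axis=0)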

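EDIT: I see you have 'Date' as the index of the df. After concatenation the dates repeat and nothing identifies which company a row belongs to, so you will need to add the symbol as a column or index level for the final df to make sense.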
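One possible way to keep track of which row belongs to which company is to pass keys to pd.concat. The sketch below is only an illustration: it uses small hand-built frames as stand-ins for the results of pdr.get_data_yahoo (so it runs without network access), and the column subset and the names `symbols`, `frames`, `combined` and `flat` are assumptions made for the example.

```python
import pandas as pd

# Stand-in frames (assumption): in the real loop each one would come from
# pdr.get_data_yahoo(symbol, start=..., end=...)
idx = pd.DatetimeIndex(["2021-06-02"], name="Date")
symbols = ["3PLAND.NS", "3IINFOTECH.NS"]
frames = [
    pd.DataFrame({"Open": [14.05], "Close": [13.5]}, index=idx),
    pd.DataFrame({"Open": [8.85], "Close": [9.2]}, index=idx),
]

# keys= adds an outer index level that labels each frame with its symbol
combined = pd.concat(frames, keys=symbols, names=["Symbol", "Date"])

# Optional: turn both index levels into ordinary columns
flat = combined.reset_index()
print(flat)
```

With the symbol stored next to the date, each row stays identifiable even though every company shares the same dates.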

Answer 2

get_data_yahoo can take a list of symbols as input; calling stack on the result converts it to long format:

sym = ['20MICRONS.NS', '21STCENMGM.NS', '3IINFOTECH.NS', '3MINDIA.NS',
       '3PLAND.NS']
df = pdr.get_data_yahoo(sym, start="2021-6-2", end="2021-6-3").stack()

df :

Attributes                   Adj Close         Close          High           Low          Open    Volume
Date       Symbols                                                                                      
2021-06-02 20MICRONS.NS      60.700001     60.700001     61.900002     59.500000     59.950001    374552
           21STCENMGM.NS     15.650000     15.650000     15.650000     15.250000     15.250000      1810
           3IINFOTECH.NS      9.200000      9.200000      9.300000      8.650000      8.850000  39107857
           3MINDIA.NS     25967.199219  25967.199219  26000.000000  25543.750000  25640.000000      3698
           3PLAND.NS         13.500000     13.500000     14.050000     13.250000     14.050000      3861
2021-06-03 20MICRONS.NS      62.549999     62.549999     64.349998     61.099998     62.250000    401022
           21STCENMGM.NS     15.950000     15.950000     15.950000     15.950000     15.950000       949
           3IINFOTECH.NS      8.950000      8.950000      9.250000      8.900000      9.200000  17823524
           3MINDIA.NS     26261.800781  26261.800781  26300.000000  25900.000000  25967.199219      2713
           3PLAND.NS         13.950000     13.950000     14.100000     13.400000     14.000000     19728

(Optionally, call reset_index to turn the MultiIndex into columns.)

df = (
    pdr.get_data_yahoo(sym, start="2021-6-2", end="2021-6-3")
        .stack()
        .reset_index()
)

df :

Attributes       Date        Symbols     Adj Close         Close          High           Low          Open    Volume
0          2021-06-02   20MICRONS.NS     60.700001     60.700001     61.900002     59.500000     59.950001    374552
1          2021-06-02  21STCENMGM.NS     15.650000     15.650000     15.650000     15.250000     15.250000      1810
2          2021-06-02  3IINFOTECH.NS      9.200000      9.200000      9.300000      8.650000      8.850000  39107857
3          2021-06-02     3MINDIA.NS  25967.199219  25967.199219  26000.000000  25543.750000  25640.000000      3698
4          2021-06-02      3PLAND.NS     13.500000     13.500000     14.050000     13.250000     14.050000      3861
5          2021-06-03   20MICRONS.NS     62.549999     62.549999     64.349998     61.099998     62.250000    401022
6          2021-06-03  21STCENMGM.NS     15.950000     15.950000     15.950000     15.950000     15.950000       949
7          2021-06-03  3IINFOTECH.NS      8.950000      8.950000      9.250000      8.900000      9.200000  17823524
8          2021-06-03     3MINDIA.NS  26261.800781  26261.800781  26300.000000  25900.000000  25967.199219      2713
9          2021-06-03      3PLAND.NS     13.950000     13.950000     14.100000     13.400000     14.000000     19728

To handle symbols without data explicitly, read the symbols one at a time and catch the error for each:

import pandas as pd
import pandas_datareader as pdr
from pandas_datareader._utils import RemoteDataError

sym = ['20MICRONS.NS', '21STCENMGM.NS', '3IINFOTECH.NS', '3MINDIA.NS',
       '3PLAND.NS', 'ANMOL.NS']

dfs = []
for s in sym:
    try:
        dfs.append(pdr.get_data_yahoo(s, start="2021-6-2", end="2021-6-3"))
    except RemoteDataError:
        print(f'{s} could not be resolved')

df = pd.concat(dfs)

print(df)

Output:

ANMOL.NS could not be resolved
                    High           Low  ...    Volume     Adj Close
Date                                    ...                        
2021-06-02     61.900002     59.500000  ...    374552     60.700001
2021-06-03     64.349998     61.099998  ...    401022     62.549999
2021-06-02     15.650000     15.250000  ...      1810     15.650000
2021-06-03     15.950000     15.950000  ...       949     15.950000
2021-06-02      9.300000      8.650000  ...  39107857      9.200000
2021-06-03      9.250000      8.900000  ...  17823524      8.950000
2021-06-02  26000.000000  25543.750000  ...      3698  25967.199219
2021-06-03  26300.000000  25900.000000  ...      2713  26261.800781
2021-06-02     14.050000     13.250000  ...      3861     13.500000
2021-06-03     14.100000     13.400000  ...     19728     13.950000
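The loop above skips symbols that cannot be resolved. If a row of null values is wanted for them instead, one option is to key the collected frames by symbol and reindex against the full symbol list afterwards. This is a minimal sketch under stated assumptions: `fetch` is a hypothetical stand-in for pdr.get_data_yahoo that returns rounded sample values and raises ValueError like the traceback in the question; in real code, catch whichever exception your installed library actually raises.

```python
import pandas as pd

def fetch(symbol):
    """Hypothetical stand-in for pdr.get_data_yahoo(symbol, start=..., end=...)."""
    sample = {"3PLAND.NS": (14.05, 13.5), "3MINDIA.NS": (25640.0, 25967.2)}
    if symbol not in sample:
        raise ValueError(symbol, "No data found, symbol may be delisted")
    open_, close = sample[symbol]
    idx = pd.DatetimeIndex(["2021-06-02"], name="Date")
    return pd.DataFrame({"Open": [open_], "Close": [close]}, index=idx)

sym = ["3PLAND.NS", "ANMOL.NS", "3MINDIA.NS"]

frames = {}
for s in sym:
    try:
        frames[s] = fetch(s).iloc[[0]]  # keep a single row per company
    except ValueError:
        print(f"{s} could not be resolved")

# Label each row with its symbol, then keep Symbol as the only index level
combined = pd.concat(list(frames.values()), keys=list(frames.keys()),
                     names=["Symbol", "Date"])
per_symbol = combined.reset_index(level="Date")

# Reindexing on the full list inserts an all-null row for every missing symbol
result = per_symbol.reindex(sym)
print(result)
```

This relies on there being exactly one row per symbol, since reindex needs a unique index; for multi-day ranges the same idea would need a (Symbol, Date) index built from all combinations.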

2 answers

solution1
0 2021-06-03 10:19:45

solution2
0 ACCEPTED 2021-06-03 10:33:23