Python Pandas DataFrame produces a KeyError

Question

The following code puts a list of stock exchange orders into a DataFrame using Pandas. I need to sort the DataFrame by value, but I can't understand why I get a KeyError.

Here's the code:

for row in values:
    if row[1] == '#N/A' or row[2] == '#N/A' or row[3] == '#N/A':
        continue
    symbol = row[0]
    price = float(row[3])
    open_p = float(row[1])
    previous_close = float(row[2])
    volume = float(row[4])
    change_at_open = round((open_p - previous_close)/previous_close,4)
    change_since_open = round((price - open_p)/open_p,4)

    if change_at_open > min_change_at_open and change_since_open < -revert_level and price > 1 and volume > 50000:
        quantity = math.floor(ptf_value/num_pos/price)
        #print('%s, %s, %s, %s, %s' % (symbol, price, change_at_open, change_since_open, quantity))
        signal_count += 1
        orders[signal_count] = {'symbol':symbol,'price':price,'quantity':quantity, 'change_at_open':change_at_open}

    df = pd.DataFrame(data = orders)
    df = df.T
    df.nlargest(10,['change_at_open'])

The content of the DataFrame df is this:

   change_at_open price quantity symbol
1          0.1634  1.55      645   IZEA
2          0.1867    64       15   BJRI
3          0.1101  10.6       94   DFRG
4          0.0741  13.6       73   DGII
5           0.087  23.2       43   EHTH
6          0.1889   2.2      454   HSGX
7          0.0652  17.6       56   CHRS
8          0.1054  3.74      267   MEIP
9          0.0758    44       22   NATI
10         0.0812  1.86      537   OBLN
11         0.0763  1.11      900   ORPN
12         0.0956  6.06      165   RMBL
13         0.1662  73.8       13   TEAM
14         0.0789  2.85      350   TTPH
15         0.1185   1.3      769   VTVT

So the column names seem pretty straightforward. I try to sort df or get the 10 largest 'change_at_open' values, but I always get the following error:

Traceback (most recent call last):

  File "<ipython-input-133-6a99d27bb6bb>", line 157, in <module>
    df.nlargest(10,['change_at_open'])

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 4625, in nlargest
    columns=columns).nlargest()

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/algorithms.py", line 1081, in nlargest
    return self.compute('nlargest')

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/algorithms.py", line 1185, in compute
    dtype = frame[column].dtype

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2685, in __getitem__
    return self._getitem_column(key)

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2692, in _getitem_column
    return self._get_item_cache(key)

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2486, in _get_item_cache
    values = self._data.get(item)

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4115, in get
    loc = self.items.get_loc(item)

  File "/Users/gilles/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3065, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'change_at_open'

How can I debug this?

Answer 1

I found the reason for the problem, which can't easily be spotted from the code I pasted in the question. It was just an indentation issue (I'm new to Python and had been looking at the code all day without seeing it!): the DataFrame was being created inside the for loop, when it should have been created after the loop had finished. That is what was causing the error.
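A minimal sketch of that fix, assuming `orders` is filled the same way as in the question. The `signals` list is hypothetical sample data (three rows copied from the table in the question) standing in for the filtering logic. The DataFrame is built once, after the loop. `DataFrame.from_dict(..., orient="index")` is swapped in for `.T` here, which is not part of the original code: transposing rows that mix strings and numbers leaves every column with `object` dtype, and `nlargest` refuses to rank `object` columns.

```python
import pandas as pd

# Hypothetical sample signals: (symbol, price, quantity, change_at_open),
# copied from the first three rows of the table in the question.
signals = [
    ("IZEA", 1.55, 645, 0.1634),
    ("BJRI", 64.0, 15, 0.1867),
    ("DFRG", 10.6, 94, 0.1101),
]

orders = {}
signal_count = 0
for symbol, price, quantity, change_at_open in signals:
    signal_count += 1
    orders[signal_count] = {
        "symbol": symbol,
        "price": price,
        "quantity": quantity,
        "change_at_open": change_at_open,
    }

# Dedented: this runs once, after the loop has filled `orders`.
# orient="index" makes each dict key a row label and keeps numeric dtypes.
df = pd.DataFrame.from_dict(orders, orient="index")

# nlargest returns a new DataFrame, so keep the result.
top = df.nlargest(2, "change_at_open")
print(top)
```

Note that `nlargest` does not sort in place, so the result has to be assigned or printed to be of any use.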

Answer 2

A KeyError is raised only when the requested key does not exist.

So, to resolve the issue, you can either use an if condition to check that the key exists before accessing it, or catch the KeyError with exception handling:

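try:
    # code block that may raise KeyError
    top = df.nlargest(10, 'change_at_open')
except KeyError:
    # the key (here, the column) does not exist
    pass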
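Both options can be sketched against the situation in the question. The assumption here is that `orders` is still empty when the DataFrame is built, which is what happens on early loop iterations before any signal has been recorded. An empty dict produces a DataFrame with no columns, so 'change_at_open' is genuinely missing.

```python
import pandas as pd

orders = {}  # assumed state: no signal has been recorded yet

# An empty dict gives a DataFrame with no rows and no columns.
df = pd.DataFrame(data=orders).T
print(list(df.columns))

# Option 1: check that the column exists before using it.
has_column = "change_at_open" in df.columns
if has_column:
    top = df.nlargest(10, "change_at_open")
else:
    top = None  # nothing to rank yet

# Option 2: catch the KeyError raised for the missing column.
try:
    df.nlargest(10, "change_at_open")
    raised = False
except KeyError:
    raised = True
```

Printing `df.columns` just before the failing call is often the quickest way to see whether the key is really there. Silently passing on the exception hides that information, so the explicit check is usually the easier one to debug.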
