ValueError: Buffer has wrong number of dimensions (expected 1, got 2) on if in statement

I'm trying to use an "if" statement inside a "for" loop to check whether the index of the current item in the loop (the index of a pandas Series containing the item) matches one of the index values of another Series, but doing so raises a ValueError. This is the line of code that causes the problem:

if(ICM_items[ICM_items['track_id'] == i].index[0] in ICM_tgt_items.index.values.flatten().tolist()):

I tried replacing either side of the "in" expression with random integers or lists, and it works. The two objects are also built correctly, but when combined in the statement they raise an error.

I hope someone can give me some hints on where the problem is, or suggest an alternative way to perform the same task.

ICM_items and ICM_tgt_items are both pandas.Series

Below is the console error:

Traceback (most recent call last):
  File "/Users/LucaButera/git/rschallenge/similarity_to_recommandable_builder.py", line 27, in <module>
    dot[ICM_tgt_items[ICM_items[ICM_items['track_id'] == i].index[0]]] = 0
  File "/Users/LucaButera/anaconda/lib/python3.6/site-packages/pandas/core/series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "/Users/LucaButera/anaconda/lib/python3.6/site-packages/pandas/indexes/base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas/index.c:3557)
  File "pandas/index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas/index.c:3240)
  File "pandas/index.pyx", line 147, in pandas.index.IndexEngine.get_loc (pandas/index.c:4194)
  File "pandas/index.pyx", line 280, in pandas.index.IndexEngine._ensure_mapping_populated (pandas/index.c:6150)
  File "pandas/src/hashtable_class_helper.pxi", line 446, in pandas.hashtable.Int64HashTable.map_locations (pandas/hashtable.c:9261)
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
[Finished in 1.26s]

I would recommend that you simplify your expressions, use .loc, and keep an eye out for edge cases (such as no row in track_id matching a given i).
With the right test data, these steps should help you to narrow down your bug hunt.
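Since the error message complains about dimensions, one cheap first step is to confirm what each object actually is before indexing into it. Note also that the traceback points at line 27 of the script, the `dot[ICM_tgt_items[...]] = 0` lookup, rather than at the "in" test quoted in the question. The sketch below is only a diagnostic aid, not a fix: the helper name `describe` and the stand-in data are hypothetical, and it does not claim to identify the cause of the error.

```python
import pandas as pd


def describe(name, obj):
    """Summarise the properties most relevant to a dimension mismatch.

    Hypothetical helper for illustration; works on a Series or a DataFrame.
    """
    return {
        "name": name,
        "type": type(obj).__name__,                 # "Series" or "DataFrame"
        "ndim": obj.ndim,                           # 1 for Series, 2 for DataFrame
        "index_levels": obj.index.nlevels,          # > 1 means a MultiIndex
        "index_is_unique": obj.index.is_unique,     # duplicate labels change lookups
    }


# Stand-in data (an assumption; substitute the real ICM_items / ICM_tgt_items).
items = pd.DataFrame({"track_id": [1, 1, 2]}, index=["C", "A", "A"])
targets = pd.Series([0.5, 0.7], index=["A", "B"])

for summary in (describe("items", items), describe("targets", targets)):
    print(summary)
```

If an object that is assumed to be a Series reports `ndim` of 2, or the index has duplicate labels, that is worth checking before looking at the comparison itself.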

Example ICM_items data:

import numpy as np
import pandas as pd

N = 7
max_track_id = 5
idx1 = ['A','B','C']
icm_idx = np.random.choice(idx1, size=N)
icm = {"track_id":np.random.randint(0, max_track_id, size=N)}
ICM_items = pd.DataFrame(icm, index=icm_idx)

ICM_items
   track_id
C         1
A         1
A         2
C         1
B         0
B         0
B         2

Example ICM_tgt_items data:

idx2 = ['A','B']
icm_tgt_idx = np.random.choice(idx2, size=N)
icm = np.random.random(size=N)
ICM_tgt_items = pd.DataFrame(icm, index=icm_tgt_idx)

          0
B  0.785614
A  0.976523
A  0.856821
B  0.098086
B  0.481140
A  0.686156
A  0.851714

Now simplify the comparison and catch potential edge cases:

for i in range(max_track_id):
    mask = ICM_items['track_id'] == i
    try:
        # use .loc for indexing, no need to flatten() or use .values on the right.
        if ICM_items.loc[mask].index[0] in ICM_tgt_items.index:
            print("found")
        else:
            print("not found")
    # catch error if i not found in track_id
    except IndexError as e:
        print(f"ERROR at i={i}: {e}")

Output:

found
not found
found
ERROR at i=3: index 0 is out of bounds for axis 0 with size 0
ERROR at i=4: index 0 is out of bounds for axis 0 with size 0
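As a variation on the loop above, the empty case can be tested explicitly instead of being caught as an IndexError. This is a minimal sketch under the same assumptions as the example data: the function name `first_label_in_target` is hypothetical, and the frames are hard-coded to the sample values shown earlier so that the result is reproducible.

```python
import pandas as pd

# Fixed copies of the example data (the original uses random values).
ICM_items = pd.DataFrame(
    {"track_id": [1, 1, 2, 1, 0, 0, 2]},
    index=["C", "A", "A", "C", "B", "B", "B"],
)
ICM_tgt_items = pd.DataFrame(
    {0: [0.78, 0.97, 0.85]},
    index=["B", "A", "A"],
)


def first_label_in_target(items, targets, track_id):
    """Return True/False if the first matching label is in targets.index,
    or None when no row has the given track_id."""
    mask = items["track_id"] == track_id
    matches = items.index[mask.to_numpy()]   # labels of the matching rows
    if matches.empty:                        # explicit check replaces try/except
        return None
    return matches[0] in targets.index


results = {i: first_label_in_target(ICM_items, ICM_tgt_items, i) for i in range(5)}
print(results)
```

Returning None for a missing track_id keeps "not found in the target index" distinct from "no such track_id at all", which the try/except version reports as an error message.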
