
Python: problem with a pandas DataFrame of lists

I have a pandas DataFrame in which every row of the "text list" column is a list.

I want to search for a value, but I get an error, even though I know the value exists.

I checked this:

df["text list"][1] == ['رهبری']

and got:

True

Then I tried to filter the rows with this:

df[df["text list"] == ['رهبری']]

and got this error:

    ValueError                                Traceback (most recent call last)
    <ipython-input-42-f14f1b2306ec> in <module>
    ----> 1 df[df["text list"] == ['رهبری']]

    ~/.local/lib/python3.6/site-packages/pandas/core/ops/__init__.py in wrapper(self, other, axis)
       1205             # as it will broadcast
       1206             if other.ndim != 0 and len(self) != len(other):
    -> 1207                 raise ValueError("Lengths must match to compare")
       1208 
       1209             res_values = na_op(self.values, np.asarray(other))

    ValueError: Lengths must match to compare

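When you compare a Series with a list directly, pandas treats the list as an array-like and expects it to have the same length as the Series so that it can do an element-wise comparison. Here the one-element list and the column have different lengths, hence the ValueError.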
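
The length mismatch can be reproduced on a small frame. This is only a sketch: the frame, its values, and the variable names are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical three-row frame whose cells are one-element lists
df = pd.DataFrame({'text list': [['aaa'], ['bbb'], ['ccc']]})

# A plain Series compared with a list of the same length works element-wise
numbers = pd.Series([1, 2, 3])
same_length = numbers == [1, 2, 3]

# Comparing the three-row column with a one-element list does not:
# pandas tries to line the list up with the rows and the lengths differ
try:
    df['text list'] == ['bbb']
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(same_length.tolist())
print(error_message)
```

So the single-row check in the question succeeds because it compares one Python list with another, while the whole-column comparison is interpreted as Series versus array.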

To avoid this, we can use apply to compare the list in each row against the target list:

# example dataframe
>>> import pandas as pd
>>> df = pd.DataFrame({'text list':[['aaa'], ['bbb'], ['ccc']]})
>>> df
  text list
0     [aaa]
1     [bbb]
2     [ccc]

Use Series.apply to check for ['bbb']:

>>> m = df['text list'].apply(lambda x: x == ['bbb'])
>>> df[m]
  text list
1     [bbb]

Since apply is basically a "loopy" implementation in the background, we can avoid the overhead of pandas and use a list comprehension instead:

>>> m = [x == ['bbb'] for x in df['text list']]
>>> df[m]
  text list
1     [bbb]
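
To check the two approaches against each other, something like the following could be used. It is a rough sketch, not a benchmark: the enlarged frame, the helper names with_apply and with_listcomp, and the repetition count are assumptions chosen for illustration, and the timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical frame: the three example rows repeated to get 3000 rows
df = pd.DataFrame({'text list': [['aaa'], ['bbb'], ['ccc']] * 1000})
target = ['bbb']


def with_apply():
    # Row-by-row comparison through Series.apply, returns a boolean Series
    return df['text list'].apply(lambda x: x == target)


def with_listcomp():
    # The same comparison as a plain Python list of booleans
    return [x == target for x in df['text list']]


mask_apply = with_apply()
mask_listcomp = with_listcomp()

# Both masks should select exactly the same rows
masks_agree = mask_apply.tolist() == mask_listcomp
matching_rows = df[mask_listcomp]

t_apply = timeit.timeit(with_apply, number=20)
t_listcomp = timeit.timeit(with_listcomp, number=20)
print(f"masks agree: {masks_agree}")
print(f"apply: {t_apply:.4f}s  list comprehension: {t_listcomp:.4f}s")
```

Either mask can be passed to df[...], since boolean indexing accepts a list of booleans as well as a boolean Series, as long as its length matches the number of rows.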
