While working with pandas I ran into an issue which I can't quite explain. Let me give an example where the DataFrame is called "reviews":
The following code doesn't run:
reviews[(reviews["points"] >= 95) & (reviews["country"] in ["Australia"])]
Instead one can use:
reviews[(reviews["points"] >= 95) & (reviews["country"].isin(["Australia"]))]
My first assumption was that this is caused by the way the bitwise operator & works, but when I tested this I was surprised to find that the following expression evaluates to True: True & ("hi" in ["hi", "Hello"])
Obviously reviews["country"] is not just a str. I guess with the operator >= some magic happens that is not implemented for in. Therefore, isin() is necessary. Maybe someone can explain this further / better?
The example works with something like the following DataFrame:
   country  description  designation   points
0  Italy    Aromas       Vulkà Bianco  87
This structure is basically taken from https://www.kaggle.com/learn/pandas lesson 2.9.
Error message: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
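in is a Python keyword, while isin is a Series method that checks "whether each element in the DataFrame is contained in values." An expression using in must evaluate to a single bool, so Python tries to convert the element-wise comparison result (a Series) to True or False, and that conversion raises the ValueError above. Operators such as >= and & have no such restriction, so a Series can return a new boolean Series for them.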
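To make the difference concrete, here is a minimal sketch. The DataFrame below is made-up sample data that only loosely follows the structure in the question, and the variable names (high_points, from_australia, in_raised) are chosen for illustration only.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's DataFrame.
reviews = pd.DataFrame(
    {
        "country": ["Italy", "Australia", "Australia"],
        "points": [87, 96, 90],
    }
)

# Comparison operators are overloaded by Series and work element-wise,
# so this yields a boolean Series with one entry per row.
high_points = reviews["points"] >= 95

# `in` must produce a single bool. The list compares each of its items
# with the Series, gets a Series back, and converting that to a bool fails.
try:
    reviews["country"] in ["Australia"]
    in_raised = False
except ValueError:
    in_raised = True

# isin() is the element-wise membership test and returns a boolean Series.
from_australia = reviews["country"].isin(["Australia"])

# Two boolean Series can be combined with & and used as a row mask.
result = reviews[high_points & from_australia]
print(result)
```

With this sample data, only the row with country "Australia" and 96 points should pass both conditions.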