Python Pandas - using .loc to select with AND and OR on multiple columns
I have a situation where I am trying to select several scenarios from a DataFrame in a single pass. This is the code I am currently using:
dfWater1 = left_merged.loc[left_merged.BVG_2M.isin(['34']) and left_merged.VHC_SC.isin(['6. Nil veg']) and left_merged.wetland.isin(['Estuarine wetlands (e.g. mangroves).', 'Lacustrine wetland (e.g. lake).']) | left_merged.RE.isin(['water', 'reef', 'ocean', 'estuary', 'canal'])].copy()
Or, with some extra parentheses to group the ANDs and separate the OR:
dfWater1 = left_merged.loc[(left_merged.BVG_2M.isin(['34']) and left_merged.VHC_SC.isin(['6. Nil veg']) and left_merged.wetland.isin(['Estuarine wetlands (e.g. mangroves).', 'Lacustrine wetland (e.g. lake).'])) | (left_merged.RE.isin(['water', 'reef', 'ocean', 'estuary', 'canal']))].copy()
Basically, I am asking to select the rows where:
(
Column BVG_2M = 34
AND
Column VHC_SC = '6. Nil veg'
AND
Column wetland is one of the following ['Estuarine wetlands (e.g. mangroves).', 'Lacustrine wetland (e.g. lake).']
)
OR
(
Column RE is one of the following ['water', 'reef', 'ocean', 'estuary', 'canal']
)
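The dataset is quite large, so I want to keep the selection fast (hence using .loc and approaching it in a vectorized way) and, if possible, avoid creating more DataFrames than necessary in order to save memory.
I think my real problem is that I am not sure how to structure the .loc statement, or whether I can do this at all.
Error message: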
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\generic.py", line 1479, in __nonzero__
f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
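The same error can be reproduced on a much smaller input. The sketch below uses two made-up boolean Series (s1 and s2 are illustrative names, not part of the original data) to show that Python's `and` asks pandas for a single True/False value for the whole Series, which pandas refuses to give:

```python
import pandas as pd

# Two invented boolean Series standing in for the .isin() results.
s1 = pd.Series([True, False, True])
s2 = pd.Series([True, True, False])

message = None
try:
    # `and` calls bool(s1) first; a multi-element Series has no single truth value.
    s1 and s2
except ValueError as exc:
    message = str(exc)

print(message)
```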
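You should use & instead of and, and wrap each condition in parentheses. Python's and tries to reduce each Series to a single True/False value, which is what raises the ValueError, whereas & combines the Series element by element. Formatting everything on separate lines also helps prevent bracket errors: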
dfWater1 = left_merged.loc[((left_merged.BVG_2M.isin(['34'])) &
(left_merged.VHC_SC.isin(['6. Nil veg'])) &
(left_merged.wetland.isin(['Estuarine wetlands (e.g. mangroves).', 'Lacustrine wetland (e.g. lake).'])))
| (left_merged.RE.isin(['water', 'reef', 'ocean', 'estuary', 'canal']))].copy()
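As a way to check the logic on a small scale, here is a minimal sketch on a toy DataFrame. The four rows are invented for illustration, and the names is_nil_veg_wetland, is_water, wetland_types and water_codes are hypothetical helpers, not part of the original code. Naming the two masks is only a readability choice; the selection is the same as in the one-statement version above.

```python
import pandas as pd

# Toy stand-in for left_merged; all values are made up for this example.
left_merged = pd.DataFrame({
    "BVG_2M": ["34", "34", "12", "34"],
    "VHC_SC": ["6. Nil veg", "6. Nil veg", "1. Remnant", "1. Remnant"],
    "wetland": [
        "Estuarine wetlands (e.g. mangroves).",
        "Not a wetland",
        "Lacustrine wetland (e.g. lake).",
        "Lacustrine wetland (e.g. lake).",
    ],
    "RE": ["12.1.1", "water", "reef", "12.3.4"],
})

wetland_types = [
    "Estuarine wetlands (e.g. mangroves).",
    "Lacustrine wetland (e.g. lake).",
]
water_codes = ["water", "reef", "ocean", "estuary", "canal"]

# First group: all three conditions must hold (element-wise AND).
is_nil_veg_wetland = (
    left_merged["BVG_2M"].isin(["34"])
    & left_merged["VHC_SC"].isin(["6. Nil veg"])
    & left_merged["wetland"].isin(wetland_types)
)

# Second group: RE is one of the water codes.
is_water = left_merged["RE"].isin(water_codes)

# Element-wise OR of the two groups, then select the matching rows.
dfWater1 = left_merged.loc[is_nil_veg_wetland | is_water].copy()
print(dfWater1)
```

Row 0 matches the first group, rows 1 and 2 match only through RE, and row 3 matches neither. Note that isin(['34']) compares against the string '34'; if BVG_2M were stored as integers, the list would need to contain 34 instead. The question does not show the column dtype, so that is something to check against the real data.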