Python: How is a ~ used to exclude data?

In the code below, I know that it returns all records that are outside of the buffer, but I'm confused about the mechanics of how that happens.

I see that a "~" (a bitwise NOT) is being used. From some googling, my understanding of ~ is that it returns the inverse of each bit in the input it is passed, e.g. if the bit is a 0 it returns a 1. Is this correct? If not, could someone please ELI5?

Could someone please explain the actual mechanics of how the code below returns records that are outside of the "my_union" buffer?

NOTE: hospitals and collisions are just GeoDataFrames.

coverage = gpd.GeoDataFrame(geometry=hospitals.geometry).buffer(10000) 
my_union = coverage.geometry.unary_union 
outside_range = collisions.loc[~collisions["geometry"].apply(lambda x: my_union.contains(x))]

I'm not sure exactly what you mean by the actual mechanics, and it's hard to know for sure without seeing the input and output, but I had a go at explaining it below in case it is helpful:

All rows of the collisions dataframe whose geometry is contained in my_union will be excluded from the newly created outside_range dataframe.

~ does indeed perform a bitwise NOT in Python. But here it is used to perform a logical NOT on each element of a list (or rather a pandas Series) of booleans. See this answer for an example.
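The same operator can mean both things because Python hands `~x` to the object's `__invert__` method, and a pandas Series defines that method to work element by element. The sketch below uses only the standard library; `BoolColumn` is a hypothetical toy class standing in for a boolean Series, not pandas' actual implementation.

```python
# Plain integers: ~ flips every bit, which works out to -(x + 1).
print(~5)   # -6
print(~0)   # -1


class BoolColumn:
    """Hypothetical stand-in for a boolean pandas Series."""

    def __init__(self, values):
        self.values = list(values)

    def __invert__(self):
        # Python calls this method whenever `~column` is evaluated.
        # Here it applies a logical NOT to every element.
        return BoolColumn(not v for v in self.values)


mask = BoolColumn([True, False, True])
inverted = ~mask
print(inverted.values)  # [False, True, False]
```

So on a boolean Series, ~ simply turns every True into False and vice versa.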

Let's assume the collisions GeoDataFrame contains points; it works similarly for other types of geometries. Let me also change the code a bit:

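import geopandas as gpd

# Buffer every hospital by 10000 CRS units, then merge the buffers into one shape.
coverage = gpd.GeoDataFrame(geometry=hospitals.geometry).buffer(10000)
my_union = coverage.geometry.unary_union
# Boolean Series: True where the collision geometry lies inside the merged buffer.
within_my_union = collisions["geometry"].apply(lambda x: my_union.contains(x))
outside_range = collisions.loc[~within_my_union]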

Then:

  1. my_union is a single (Multi)Polygon.

  2. my_union.contains(x) returns a boolean indicating whether the point x is within my_union.

  3. collisions["geometry"] is a pandas Series containing the points.

  4. collisions["geometry"].apply(lambda x: my_union.contains(x)) runs my_union.contains(x) on each of these points. The result is another pandas Series containing booleans, indicating whether each point is within my_union.

  5. ~ then negates these booleans, so that the Series now indicates whether each point is not within my_union.

  6. collisions.loc[~within_my_union] then selects all the rows of collisions where the entry in ~within_my_union is True, i.e. all the points that don't lie within my_union.
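To watch these steps happen on concrete data, here is a small sketch. It is an illustration under assumptions, not the original data: it uses plain pandas and shapely rather than geopandas to stay short, and the two discs and four points are made up.

```python
import pandas as pd
from shapely.geometry import Point

# Hypothetical stand-in for the merged hospital buffers: two discs of radius 1.
# They do not touch, so their union is a MultiPolygon.
my_union = Point(0, 0).buffer(1).union(Point(5, 0).buffer(1))

# Hypothetical collision points; "a" and "c" fall inside a disc, "b" and "d" do not.
collisions = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "geometry": [Point(0, 0.5), Point(2.5, 0), Point(5, 0.2), Point(9, 9)],
})

# Steps 2-4: one boolean per row, True if the point is inside my_union.
within_my_union = collisions["geometry"].apply(lambda x: my_union.contains(x))
print(within_my_union.tolist())      # [True, False, True, False]

# Step 5: ~ flips every boolean.
print((~within_my_union).tolist())   # [False, True, False, True]

# Step 6: .loc keeps only the rows where the mask is True.
outside_range = collisions.loc[~within_my_union]
print(outside_range["name"].tolist())  # ['b', 'd']
```

Nothing geometric happens in the ~ step itself. The geometry work is done by contains; ~ and .loc only flip and apply the resulting mask.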
