
Spatial inverse subset using square brackets in R

I have a spatial points data frame, spatial_points,

and a polygon, spatial_poly.

I can subset all points within the polygon using

subset_within <- spatial_points[spatial_poly,]  which is nice and intuitive.

But if I want to subset all points outside the polygon, I can't use

subset_outside <- spatial_points[-spatial_poly,]

This question has been asked before, and the answer was to use gDifference() from the rgeos package. Fine.

My question is: why does [ ] work for selecting the points within the polygon, but not for the inverse? I don't really understand the error message

Error in h(simpleError(msg, call)) : error in evaluating the argument 'i' in selecting a method for function '[': invalid argument to unary operator

Just curious. Thanks.

EDIT

Here is an example borrowed from "Subset spatial points with a polygon":

require(rgeos)
require(sp)

##create spdf
coords=expand.grid(seq(150,151,0.1),seq(-31,-30,0.1))
spdf=data.frame("lng"=coords[,1],"lat"=coords[,2])
coordinates(spdf) = ~lng+lat
proj4string(spdf)<- CRS("+init=epsg:4326")
plot(spdf)

##create poly
poly1 = SpatialPolygons(list(Polygons(list(Polygon(cbind(c(150.45,150.45,150.75,150.75,150.45),c(-30.75,-30.45,-30.45,-30.75,-30.75)))),ID=1)))
proj4string(poly1)<- CRS("+init=epsg:4326")

##get points within polygon
points_within <- spdf[poly1,]  # this works

plot(spdf)
plot(poly1, add=TRUE)
plot(points_within,col="blue",pch=16,add=TRUE)

##get points outside polygon
points_outside <-spdf[-poly1,]  # this does not work - why??

In this simple example gDifference() works. However, my SpatialPointsDataFrame is very large, and using gDifference() crashes R.

When you write df[2, 1] in R, you are actually calling a function: `[`(df, 2, 1). The parser just hides this from you, which lets you write code in a more natural way.
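As a quick illustration of that equivalence, here is a minimal base R sketch. The data frame df and its columns are made up for the example; they are not from the question.

```r
# A small, hypothetical data frame used only for illustration
df <- data.frame(a = c(10, 20, 30), b = c("x", "y", "z"))

# Ordinary bracket syntax: row 2, column 1
via_brackets <- df[2, 1]

# The same operation written as an explicit function call
via_function <- `[`(df, 2, 1)

# Both forms go through the same function, so the results match
identical(via_brackets, via_function)
```

The backticks are only needed because [ is not a syntactically ordinary function name.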

If you think about it, the [ operator does different things depending on the type of object you are using, even if the operations are conceptually similar. The actual code that returns a subset of a numeric vector is different from the code that returns a subset of a matrix or of a list. In fact, there are some objects in R for which calling the [ function doesn't make sense and isn't implemented. For example, if you try to call it on a function name:

print[1]
#> Error in print[1] : object of type 'closure' is not subsettable

If you create a complex new class in R with various different members, you need to define what the [ operator means, and you need to implement it. What does it mean to subset a SpatialPoints object by a SpatialPolygons object? R has no way to know that on its own, so when the authors of the sp package created these classes, they had to write the methods that do the subsetting based on the operands that are passed to the [ operator. You can see the source code in the sp package's repository on GitHub.
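To make this concrete, here is a rough sketch of how a class author might give [ a meaning for two toy S4 classes. PointSet and Box are hypothetical classes invented for this example. This is not how sp implements its methods, only the same general idea in miniature.

```r
library(methods)

# Hypothetical toy classes: a set of 2D points and an axis-aligned box
setClass("PointSet", representation(x = "numeric", y = "numeric"))
setClass("Box", representation(xmin = "numeric", xmax = "numeric",
                               ymin = "numeric", ymax = "numeric"))

# Define what pts[box] means: keep only the points that fall inside the box
setMethod("[", signature(x = "PointSet", i = "Box"),
  function(x, i, j, ..., drop = TRUE) {
    keep <- x@x >= i@xmin & x@x <= i@xmax &
            x@y >= i@ymin & x@y <= i@ymax
    new("PointSet", x = x@x[keep], y = x@y[keep])
  })

pts <- new("PointSet", x = c(0, 1, 2, 3), y = c(0, 1, 2, 3))
box <- new("Box", xmin = 0.5, xmax = 2.5, ymin = 0.5, ymax = 2.5)

# Dispatches to the method defined above
inside <- pts[box]
inside@x
```

Without the setMethod() call, R would have no definition of what indexing a PointSet by a Box should do.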

If you trace the logic through, you will see that in the case of spdf[poly1,], the subset is determined by other spatial functions, and boils down to

which(!is.na(over(spdf, geometry(poly1))))
#> 39 40 41 50 51 52 61 62 63 
#> 39 40 41 50 51 52 61 62 63

These numeric indices are then used to subset the actual points, returning a new object consisting of just that subset. That means we could get points_outside in a similar way:

points_within  <- spdf[poly1,] 
points_outside <- spdf[which(is.na(over(spdf, geometry(poly1)))), ]  # trailing comma: select rows (points)

plot(spdf)
plot(poly1, add = TRUE)
plot(points_within, col="blue", pch = 16, add = TRUE)
plot(points_outside, col="red", pch = 16, add = TRUE)
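Since the question mentions a very large data set, one variation that may help is to call over() once and reuse the result as a logical mask for both subsets. The sketch below rebuilds the example without a CRS to keep it short (an assumption made for brevity; real data would normally carry one). The names pts, box and inside are introduced here for illustration.

```r
library(sp)

# Same grid of points as in the question, without a CRS for brevity
coords <- expand.grid(lng = seq(150, 151, 0.1), lat = seq(-31, -30, 0.1))
pts <- SpatialPoints(as.matrix(coords))

# Same polygon as in the question
ring <- cbind(c(150.45, 150.45, 150.75, 150.75, 150.45),
              c(-30.75, -30.45, -30.45, -30.75, -30.75))
box <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "1")))

# over() returns the index of the containing polygon, or NA if there is none
inside <- !is.na(over(pts, box))

# One spatial overlay, two subsets
points_within  <- pts[inside, ]
points_outside <- pts[!inside, ]
```

Negating the mask with ! plays the role that -poly1 was meant to play in the question, but it operates on a plain logical vector, for which negation is well defined.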


But to answer your main question, which is why spdf[-poly1,] does not work, you have to realise that this actually means `[`(spdf, -poly1). For this to be evaluated, R first has to evaluate -poly1, but if you try that on its own, you get:

-poly1
#> Error in -poly1 : invalid argument to unary operator

Of course, it doesn't really make sense to apply the - operator to a SpatialPolygons object on its own. Take the polygon away from what?
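The same failure can be reproduced without any spatial packages. In this minimal sketch, Region is a hypothetical S4 class with no arithmetic methods defined, standing in for the polygon object. The index expression is evaluated before any subsetting happens, so the error comes from the unary minus rather than from [ itself.

```r
library(methods)

# A hypothetical class with no "-" method, standing in for the polygon
setClass("Region", representation(id = "character"))
region <- new("Region", id = "poly1")

# Unary minus on its own fails
msg_minus <- tryCatch(-region, error = function(e) conditionMessage(e))

# Used as an index, the same expression fails before subsetting can happen
msg_index <- tryCatch(letters[-region], error = function(e) conditionMessage(e))

msg_minus
msg_index
```

Both messages should mention the invalid argument to the unary operator, which is the core of the error quoted in the question.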

In truth, it would be possible to write the method so that it works this way, but it would require a fairly complex bit of non-standard evaluation. You could submit it as a feature request on that GitHub page, but personally I'd be happy using the approach above.

I hope that makes things clearer.


 