
Handling many points at one position in R

I have a question regarding data handling in R. I have two datasets, both originally .csv files. I've prepared two example datasets:

Table A - Persons
http://pastebin.com/HbaeqACi

Table B - City
http://pastebin.com/Fyj66ahq

To keep the effort as low as possible, here is the corresponding R code for loading and visualizing the data:

# Read csv files
# check pastebin links and save content to persons.csv and city.csv.
persons_dataframe = read.csv("persons.csv", header = TRUE)
city_dataframe = read.csv("city.csv", header = TRUE)
# plot them on a map
# load used packages
library(RgoogleMaps)
library(ggplot2)
library(ggmap)
library(sp)

persons_ggplot2 <- persons_dataframe
city_ggplot2 <- city_dataframe
gc <- geocode('new york, usa')
center <- as.numeric(gc)
G <- ggmap(get_googlemap(center = center, color = 'color', scale = 4, zoom = 10,
                         maptype = "terrain", frame = TRUE),
           extent = "panel")
G1 <- G +
  geom_point(aes(x = POINT_X, y = POINT_Y), data = city_dataframe,
             shape = 22, color = "black", fill = "yellow", size = 4) +
  geom_point(aes(x = POINT_X, y = POINT_Y), data = persons_dataframe,
             shape = 8, color = "red", size = 2.5)
plot(G1)

As a result I have a map that visualizes all cities and persons.
My problem: all persons are located in just these three cities, so many points share the same position.

My questions:

  1. A more general question: Is this a problem for R?
  2. I want to create something like a bubble map that visualizes the number of persons at one position. For example: City A has 20 persons and City B has 5, so the position of City A should get a bigger bubble than City B.
  3. I want to create a label that states the number of persons at a certain position. I've already tried to do this with ggplot2's geom_text options, but I can't figure out how to sum up all points at a certain position and write the total to a label.
  4. A more theoretical question (maybe I'll come back to this later): I want to create something like a density map / cluster map that shows the area with the highest number of persons. I've already searched for packages I could use; the suggested ones were SpatialEpi, spatstat and DCluster. My question: do I need the distance from the persons to a certain object (let's say a supermarket) to perform a cluster analysis?

Hopefully, these were not too many questions.
Any help is much appreciated. Thanks in advance!

Btw: Is there a better way to prepare a question containing example datasets? Should I upload a file somewhere, or is the pastebin way okay?

You can create the bubble chart by counting the number of persons in each city and mapping the size of the points to the counts:

library(plyr)
persons_count <- count(persons_dataframe, vars = c("city", "POINT_X", "POINT_Y"))

G + geom_point(aes(x=POINT_X, y=POINT_Y, size=freq),data=persons_count, color="red")
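
If you would rather not load plyr, the counting step could also be sketched in base R. The data frame below is made up for illustration (the city names and coordinates are assumptions, not the pastebin data); only the column names city, POINT_X, POINT_Y and freq follow the code above.

```r
# Toy stand-in for persons_dataframe: one row per person.
# City names and coordinates are invented for illustration.
persons_dataframe <- data.frame(
  city    = c(rep("A", 20), rep("B", 5), rep("C", 2)),
  POINT_X = c(rep(-74.00, 20), rep(-73.95, 5), rep(-74.10, 2)),
  POINT_Y = c(rep(40.71, 20), rep(40.65, 5), rep(40.80, 2))
)

# Add a helper column of ones and sum it per unique position.
persons_dataframe$freq <- 1
persons_count <- aggregate(freq ~ city + POINT_X + POINT_Y,
                           data = persons_dataframe, FUN = sum)
print(persons_count)
```

The result has one row per distinct position with the number of persons in freq, which is the shape the geom_point call above expects. Note that this sketch leaves the helper column freq in persons_dataframe.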

You can map the counts to the area of the points, which perhaps gives a better sense of the relative sizes:

G + geom_point(aes(x=POINT_X, y=POINT_Y, size=freq),data=persons_count, color="red") +
    scale_size_area(breaks = unique(persons_count$freq))
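
As a rough arithmetic check of what area scaling means, using the counts 20 and 5 from the question: if the area of a point is proportional to its count, the radius grows only with the square root of the count. This is a sketch of the geometry, not of ggplot2's internals.

```r
# Counts taken from the example in the question (City A: 20, City B: 5).
counts <- c(A = 20, B = 5)

# With area proportional to count, A covers 4 times the area of B ...
area_ratio <- counts[["A"]] / counts[["B"]]

# ... but its radius (and diameter) is only sqrt(4) = 2 times as large.
radius_ratio <- sqrt(area_ratio)

print(c(area_ratio = area_ratio, radius_ratio = radius_ratio))
```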

You can add the frequency labels, though this is somewhat redundant with the size scale legend:

G + geom_point(aes(x=POINT_X, y=POINT_Y, size=freq),data=persons_count, color="red") +
    geom_text(aes(x = POINT_X, y=POINT_Y, label = freq), data=persons_count) +
    scale_size_area(breaks = unique(persons_count$freq))
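
A variation on the labels, sketched on a plain ggplot() canvas instead of the map G so that it runs without fetching map tiles. The counts, city names and coordinates are made up, and the label column and the vjust offset are illustrative choices rather than part of the original code.

```r
library(ggplot2)

# Made-up counts per position, with the same columns as persons_count.
persons_count <- data.frame(
  city    = c("A", "B", "C"),
  POINT_X = c(-74.00, -73.95, -74.10),
  POINT_Y = c(40.71, 40.65, 40.80),
  freq    = c(20, 5, 2)
)

# Build the label text once, then draw it slightly above each bubble.
persons_count$label <- paste0(persons_count$city, ": ", persons_count$freq)

p <- ggplot(persons_count, aes(x = POINT_X, y = POINT_Y)) +
  geom_point(aes(size = freq), color = "red") +
  geom_text(aes(label = label), vjust = -1.5) +
  scale_size_area(breaks = persons_count$freq)
print(p)
```

To use this with the map, the same geom_point and geom_text layers can be added to G by passing data = persons_count to each layer, as in the code above.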

You can't really plot densities with your example data because you only have three distinct positions. But if you had more fine-grained location information, you could calculate and plot the densities using the stat_density2d function in ggplot2.
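
A minimal sketch of what that could look like, assuming fine-grained coordinates were available. The data here are simulated (two invented clusters around made-up centres), persons_fine is a hypothetical name, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(42)
# Simulated fine-grained positions: two invented clusters of persons.
persons_fine <- data.frame(
  POINT_X = c(rnorm(200, -74.00, 0.03), rnorm(80, -73.90, 0.02)),
  POINT_Y = c(rnorm(200, 40.71, 0.02), rnorm(80, 40.78, 0.02))
)

# Draw the raw points and overlay filled 2D kernel density contours.
p <- ggplot(persons_fine, aes(x = POINT_X, y = POINT_Y)) +
  geom_point(alpha = 0.3, size = 0.8) +
  stat_density2d(aes(fill = after_stat(level)), geom = "polygon", alpha = 0.3)
print(p)
```

On the map, the density layer could be added to G instead, with data = persons_fine and the x/y mapping passed to the layer itself.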
