How can I transform my data frame into this specific format in R?
My current data frame in R has only 2 columns, namely longitude and latitude. There are around 1500 records (rows), and they include lots of duplicates.
An extract of the data frame is shown below:
longitude latitude
57.408999 -20.208104
57.667991 -20.13641
57.539122 -20.103416
57.502332 -20.124798
57.414653 -20.261872
57.65949 -20.126768
57.468383 -20.223031
57.754464 -20.25823
57.754464 -20.25823
57.680745 -20.121893
57.65949 -20.179457
57.669408 -20.177538
57.702715 -20.211515
I want to convert this data frame into the following format:
longitude latitude emp emp2
57.408999 -20.208104 1 0.1
57.667991 -20.13641 11 1.1
57.539122 -20.103416 16 1.6
57.502332 -20.124798 10 1
57.414653 -20.261872 1 0.1
57.65949 -20.126768 2 0.2
57.468383 -20.223031 17 1.7
57.754464 -20.25823 9 0.9
57.754464 -20.25823 13 1.3
57.680745 -20.121893 13 1.3
57.65949 -20.179457 4 0.4
57.669408 -20.177538 3 0.3
57.702715 -20.211515 1 0.1
emp will be a new column holding the frequency of each longitude and latitude pair. Thus my data frame will now only have unique longitude and latitude pairs with their respective counts.
emp2 is simply the value of emp divided by 10.
Can this be done with R? If yes, any help would be highly appreciated.
Since I am new to R, I am confused as to where to start to solve the issue.
An easy way with dplyr would be:
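library(dplyr)
df %>%
  # one group per distinct (longitude, latitude) pair
  group_by(longitude, latitude) %>%
  # n() counts the rows in each group; emp2 reuses the emp just computed
  summarise(emp = n(),
            emp2 = emp / 10)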
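To see what the pipeline returns, here is a minimal sketch that runs it on four rows taken from the question's extract, one pair of which is duplicated. The names sample_df and result are made up for this illustration, and the trailing ungroup() is an optional addition that just drops the leftover grouping.

```r
library(dplyr)

# Hypothetical sample: four rows from the question's extract, one pair duplicated
sample_df <- data.frame(
  longitude = c(57.408999, 57.754464, 57.754464, 57.680745),
  latitude  = c(-20.208104, -20.25823, -20.25823, -20.121893)
)

result <- sample_df %>%
  group_by(longitude, latitude) %>%       # one group per distinct pair
  summarise(emp = n(), emp2 = emp / 10) %>% # count rows, then scale by 10
  ungroup()                               # optional: return a plain tibble

print(result)
```

The duplicated pair collapses into a single row with emp equal to 2, so the result has three rows.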
An alternative base R solution using aggregate:
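# Group by both columns; FUN = length is applied to each original column,
# so the result holds the two grouping keys followed by the count twice
df <- aggregate(df, by = list(df$longitude, df$latitude), FUN = length)
# Columns 1-2 are the keys, columns 3-4 both contain the group size
colnames(df) <- c('longitude', 'latitude', 'emp', 'emp2')
df$emp2 <- df$emp2 / 10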
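A variant that does not depend on the position of the aggregated columns can be sketched with the formula interface of aggregate. This is not the answerer's code, only an illustration under the assumption that adding a helper column of ones is acceptable; sample_df and out are made-up names.

```r
# Hypothetical sample: four rows from the question's extract, one pair duplicated
sample_df <- data.frame(
  longitude = c(57.408999, 57.754464, 57.754464, 57.680745),
  latitude  = c(-20.208104, -20.25823, -20.25823, -20.121893)
)

# Helper column of ones: summing it per group yields the frequency
sample_df$emp <- 1L

# One output row per distinct (longitude, latitude) pair
out <- aggregate(emp ~ longitude + latitude, data = sample_df, FUN = sum)
out$emp2 <- out$emp / 10

print(out)
```

Because the count column is named in the formula, no renaming step is needed afterwards.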