How can I transform my data frame into this specific format in R?

My current data frame in R has only two columns, longitude and latitude. It has around 1500 records (rows), many of which are duplicates.

An extract of the data frame is shown below:

longitude   latitude
57.408999   -20.208104
57.667991   -20.13641
57.539122   -20.103416
57.502332   -20.124798
57.414653   -20.261872
57.65949    -20.126768
57.468383   -20.223031
57.754464   -20.25823
57.754464   -20.25823
57.680745   -20.121893
57.65949    -20.179457
57.669408   -20.177538
57.702715   -20.211515
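
For anyone who wants to try things out on this extract, here is a minimal sketch that rebuilds it as a data frame. The name df is an assumption (it matches the name used in the answers), and only the 13 rows shown above are included, not the full data set.

```r
# Recreate the 13-row extract shown above as a data frame named df
df <- data.frame(
  longitude = c(57.408999, 57.667991, 57.539122, 57.502332, 57.414653,
                57.65949, 57.468383, 57.754464, 57.754464, 57.680745,
                57.65949, 57.669408, 57.702715),
  latitude  = c(-20.208104, -20.13641, -20.103416, -20.124798, -20.261872,
                -20.126768, -20.223031, -20.25823, -20.25823, -20.121893,
                -20.179457, -20.177538, -20.211515)
)

# Number of rows that exactly repeat an earlier (longitude, latitude) pair
sum(duplicated(df))
```

In this extract only one row repeats an earlier pair exactly (57.754464, -20.25823); 57.65949 appears twice, but with different latitudes.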

I want to convert this data frame into the following format:

longitude   latitude    emp emp2
57.408999   -20.208104  1   0.1
57.667991   -20.13641   11  1.1
57.539122   -20.103416  16  1.6
57.502332   -20.124798  10  1
57.414653   -20.261872  1   0.1
57.65949    -20.126768  2   0.2
57.468383   -20.223031  17  1.7
57.754464   -20.25823   9   0.9
57.754464   -20.25823   13  1.3
57.680745   -20.121893  13  1.3
57.65949    -20.179457  4   0.4
57.669408   -20.177538  3   0.3
57.702715   -20.211515  1   0.1

emp will be a new column holding the frequency of each longitude and latitude pair. My data frame will then contain only unique longitude and latitude pairs with their respective counts.

emp2 is simply the value of emp divided by 10.

Can this be done with R? If yes, any help would be highly appreciated.

Since I am new to R, I am not sure where to start.

An easy way with dplyr would be:

library(dplyr)

# Count the rows for each unique (longitude, latitude) pair
df %>%
  group_by(longitude, latitude) %>%
  summarise(emp = n(),
            emp2 = emp / 10,
            .groups = "drop")  # return an ungrouped result

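An alternative base R solution uses aggregate:

# aggregate() returns the two grouping columns followed by one row count
# per original column, so the two count columns start out identical.
# Referring to df$longitude and df$latitude directly avoids attach().
df <- aggregate(df, by = list(df$longitude, df$latitude), FUN = length)
colnames(df) <- c('longitude', 'latitude', 'emp', 'emp2')
df$emp2 <- df$emp2 / 10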
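
As a quick check, the aggregate idea can be tried on a few rows taken from the extract in the question. This is only a sketch: the names sample_df and counts are introduced here for illustration, counting is done by summing a column of ones rather than with FUN = length, and the counts apply to these five rows only, not to the full data set.

```r
# Five rows from the question's extract, including the duplicated pair
sample_df <- data.frame(
  longitude = c(57.408999, 57.667991, 57.754464, 57.754464, 57.680745),
  latitude  = c(-20.208104, -20.13641, -20.25823, -20.25823, -20.121893)
)

# Sum a column of ones within each unique (longitude, latitude) pair
counts <- aggregate(
  list(emp = rep(1L, nrow(sample_df))),
  by = list(longitude = sample_df$longitude, latitude = sample_df$latitude),
  FUN = sum
)
counts$emp2 <- counts$emp / 10

print(counts)
```

Naming the elements of the by list gives the grouping columns their final names directly, so no colnames() step is needed. The result has four rows, and the duplicated pair (57.754464, -20.25823) gets emp = 2 and emp2 = 0.2.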
