How to combine two rows based on multiple columns in a dataframe?
The title says it all. I have a large dataset of factories (mills) with their latitude and longitude, among other columns. Some of the factories have identical latitude and longitude, although their names differ slightly. How can I combine the rows of factories that share the same lat-long in R?
mill | latitude | longitude | ID |
---|---|---|---|
a. | 12.34 | 7.86 | NA |
A. | 12.34 | 7.86 | 4 |
b | 47.56 | 27.07 | 5 |
The output I am looking for is:
mill | latitude | longitude | ID |
---|---|---|---|
A. | 12.34 | 7.86 | 4 |
b | 47.56 | 27.07 | 5 |
Base R:
aggregate(. ~ latitude + longitude, df, tail, 1)
latitude longitude mill ID
1 12.34 7.86 A. 4
2 47.56 27.07 b 5
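The `aggregate()` call above depends on the formula method's default `na.action = na.omit`, which drops the row with the missing ID before grouping. If you would rather make that preference explicit, here is a minimal base R sketch using `order()` and `duplicated()`. The data frame `df` is rebuilt from the sample rows in the question (an assumption about how the real data looks), and `ordered` and `result` are illustrative names.

```r
# Sample data rebuilt from the question's table (assumed structure)
df <- data.frame(
  mill = c("a.", "A.", "b"),
  latitude = c(12.34, 12.34, 47.56),
  longitude = c(7.86, 7.86, 27.07),
  ID = c(NA, 4, 5)
)

# Sort by coordinates; within each pair, rows with a non-missing ID come first
# because is.na() returns FALSE for them and FALSE sorts before TRUE
ordered <- df[order(df$latitude, df$longitude, is.na(df$ID)), ]

# Keep only the first row of each (latitude, longitude) pair
result <- ordered[!duplicated(ordered[c("latitude", "longitude")]), ]
rownames(result) <- NULL

print(result)
```

Unlike the `aggregate()` version, this keeps a coordinate pair even when all of its IDs are missing, which may or may not be what you want.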
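We can use dplyr:
library(dplyr)
df1 %>%
# sort so that, within each coordinate pair, rows with a non-missing ID come first
arrange(latitude, longitude, is.na(ID)) %>%
# keep the first row for each (latitude, longitude) pair, with all other columns
distinct(latitude, longitude, .keep_all = TRUE)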
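To try the dplyr approach on the sample rows from the question, something like the following should work. It is only a sketch: `df1` is rebuilt here from the question's table (an assumed structure), and `result` is an illustrative name.

```r
library(dplyr)

# Sample data rebuilt from the question's table (assumed structure)
df1 <- data.frame(
  mill = c("a.", "A.", "b"),
  latitude = c(12.34, 12.34, 47.56),
  longitude = c(7.86, 7.86, 27.07),
  ID = c(NA, 4, 5)
)

result <- df1 %>%
  # is.na(ID) is FALSE for rows that have an ID, so those sort first in each pair
  arrange(latitude, longitude, is.na(ID)) %>%
  # distinct() keeps the first row it sees for each coordinate pair
  distinct(latitude, longitude, .keep_all = TRUE)

print(result)
```

Because `distinct()` keeps the first row per pair, the sort order decides which mill name survives; add further keys to `arrange()` if you need a different tie-break.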